Is 8 bits enough for digital voice? — Parallax Forums

I might have to learn pasm because sampling audio would require it.
Digital audio is a brand new area for me. I don't mind quality lower than telephone voice. I just want to know if one byte carries enough data.
I'll be satisfied with my original goal of wireless motor control but being able to speak through a wireless robot would just be 'over the top' fun.

Comments

  • Heater.Heater. Posts: 21,230
    Sure, 8 bits is quite enough to get intelligible speech.

    Next thing to worry about is the sample rate. 8KHz should do.
  • You can actually get recognizable audio in 1 bit with a higher sampling rate. 4 bits at 8000hz is "dirty" but recognizable. 8 bit samples were very common for a long time on the Amiga, and were the common form for MOD files. It won't be CD quality audio, but 8 bit sounds quite good.

    J
  • potatoheadpotatohead Posts: 10,261
    edited 2015-12-23 18:21
    8 bits can sound as good as 16 bits in most cases. The difference is the noise floor. 8 bits is something like 40 to 50db. Plenty for voice.

    Be sure and do at least 8khz, if not 16khz sampling.

    Voice works in as little as 3 to 3.5khz of bandwidth, meaning 8khz sampling is near the minimum. Optimal voice reproduction happens at about 8khz, ideally 10khz of audio bandwidth.

    Many of the harmonics that clarify voice and get away from a nasal or muffled sound exist above 5khz, which is a very good compromise.

    If you are sampling voices, be sure to compress, equalize and normalize your samples. You want them full and loud to ensure all the good stuff is above that -40db or so noise floor. That is the single most important thing you can do to really maximize 8 bit samples. People hear the signal, and when it is well differentiated from noise, the minor artifacts inherent in 8 bit samples are largely ignored by most listeners.

    For the eq, use a hard roll off at half your sample rate and maybe punch the highs and lows up 5db or so, depending on the voice. For men, a little 100hz and 3khz boost is good. For most women, 300hz and 3.5 to 4khz is good. If you aren't sure, leave it flat and employ aggressive compression.

    A nice mic really helps.

    Audacity does all this nice and easy. You can optionally record samples at a higher bitrate, do the processing, then output at the target rate (recommended). This gives you a little room to get a good sample and process without excessive ringing and artifacts, both of which will make the voice sound muddy and/or nasal.

    Google AM radio production for some tips on this. For modest sample rates and depths, they have it right.

    Also, if possible, make sure your playback device has good response below 3khz. This makes a world of difference. Often, just adding a little mass (more substantial speaker and enclosure) and a firm mounting will do this for many enclosures.

    You don't have to do all of that, but it's there for those who might want to maximize 8 bit samples.

    A byte carries plenty of data. Just make whatever you speak, loud. :)
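
The "make it loud" advice can be checked numerically. The following is a minimal sketch, not anything from the posts above: it assumes an ideal 8-bit quantizer and a synthetic test tone, and the function names are made up for illustration. It compares the quantization SNR of a quiet tone with the same tone after peak normalization, next to the textbook figure of roughly 6 dB per bit.

```python
import math

def theoretical_snr_db(bits):
    """Textbook SNR of an ideal N-bit quantizer driven by a full-scale sine."""
    return 6.02 * bits + 1.76

def quantize_8bit(x):
    """Map a sample in [-1.0, 1.0] to a signed 8-bit level and back to a float."""
    level = max(-128, min(127, round(x * 127)))
    return level / 127

def measured_snr_db(signal):
    """Ratio of signal power to 8-bit quantization error power, in dB."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - quantize_8bit(s)) ** 2 for s in signal)
    return 10 * math.log10(signal_power / noise_power)

def normalize(signal, peak=1.0):
    """Scale the signal so its largest sample reaches the given peak."""
    largest = max(abs(s) for s in signal)
    return [s * peak / largest for s in signal]

# One second of a quiet test tone at an 8 kHz sample rate (illustrative values).
quiet_tone = [0.05 * math.sin(2 * math.pi * 437 * n / 8000) for n in range(8000)]
quiet_snr = measured_snr_db(quiet_tone)
loud_snr = measured_snr_db(normalize(quiet_tone))

print(f"ideal 8-bit: {theoretical_snr_db(8):.1f} dB")
print(f"quiet tone:  {quiet_snr:.1f} dB, normalized: {loud_snr:.1f} dB")
```

The quiet tone only uses a handful of the 256 available levels, so its error is large relative to the signal; after normalization the same tone lands close to the ideal 8-bit figure.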
  • Heater, JasonDorie and Potatohead, that is great news. Now I know the journey is worth the work. Thank you.
  • Duane DegnDuane Degn Posts: 10,588
    edited 2015-12-24 04:47
    There's a cool program in the Propeller Tool's "_Demos" folder named "microphone_to_headphones.spin". It takes the sound from a microphone, digitizes it, and then sends it out through the headphone jack.

    I think it will give you a good idea of what sort of sound quality you can get with 8-bits.

    Here are some comments from the program.
    ' At 80MHz the ADC/DAC sample resolutions and rates are as follows:
    '
    ' sample   sample               
    ' bits       rate               
    ' ----------------              
    ' 5       2.5 MHz               
    ' 6      1.25 MHz               
    ' 7       625 KHz               
    ' 8       313 KHz               
    ' 9       156 KHz               
    ' 10       78 KHz               
    ' 11       39 KHz               
    ' 12     19.5 KHz               
    ' 13     9.77 KHz               
    ' 14     4.88 KHz
    

    The program defaults to 11-bit audio but as you can see, the sample bits go as low as 5 bits.

    It's been a few years since I've played with the program but I thought it was pretty cool.

    Of course the PASM code will give you a good head start with your project.

    You'll need to figure out how to modify the sample rate if you want to reduce the amount of data required to send to a robot.
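
The table in the program comments is consistent with one sample taking 2^bits clock cycles, so the rate is the 80 MHz clock divided by 2^bits. The sketch below reproduces the table from that assumption and shows one simple way to bring the data rate down afterwards, by averaging groups of samples. Both function names are hypothetical, and the averaging step is only one possible approach, not what the demo program does.

```python
def sample_rate_hz(bits, clock_hz=80_000_000):
    """Sample rate if each sample accumulates over 2**bits clock cycles (assumption)."""
    return clock_hz / (1 << bits)

def decimate(samples, factor):
    """Reduce the sample rate by averaging each group of `factor` samples."""
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) // factor for i in range(0, usable, factor)]

for bits in range(5, 15):
    print(f"{bits:2d} bits -> {sample_rate_hz(bits) / 1000:9.2f} kHz")

# Averaging 11-bit samples in groups of 5 would take about 39 kHz down to about 7.8 kHz.
print(sample_rate_hz(11) / 5)
```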
  • I just remembered, Mark_T posted code that lets a Nordic nRF24L01+ transceiver send audio.

    I've never used it myself. The last time I looked at it, I didn't understand how to use it. Mark_T didn't include a demo top object.
  • Duane Degn wrote: »
    There's a cool program in the Propeller Tool's "_Demos" folder named "microphone_to_headphones.spin". It takes the sound from a microphone, digitizes it, and then sends it out through the headphone jack.

    I'll study that object line by line alongside the Propeller manual.
    BTW, I've drawn a 'lot' from your work. I wish you would post it to the OBEX. It's been very helpful. Thanks.
    I've learned even more from Erlend's work. I'm pretty comfortable with what I can do so far.
    I just desoldered the joysticks from a usb joypad to plug into my breadboard so that's what I'm working on at the moment.
  • jac_goudsmitjac_goudsmit Posts: 418
    edited 2015-12-27 04:33
    Voice connections usually use 8kHz sample rates but with 8 bits the quantization noise and lack of dynamics is a problem. Most systems I know use (or can use) 16 bit sampling combined with uLaw* or aLaw (G.711) compression. This is a very lightweight compression/decompression algorithm that converts each 16 bit sample to an 8 bit value. It's lossy but it still sounds pretty good for voice. Sample code should be easy to find on the Web. Some implementations use a table, but I don't think a Propeller would have enough memory space for the table; nevertheless I'm pretty sure a Prop is fast enough to run an aLaw/uLaw compressor at real-time or faster.

    ===Jac

    *(The u in uLaw is supposed to be a Greek letter mu but I can't be bothered to find the keyboard code for that right now)
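
For reference, here is a minimal sketch of a table-free mu-law compressor and expander in the style of the widely circulated G.711 reference code. It is written in Python for readability rather than Spin or PASM, the function names are made up, and it should be checked against the G.711 specification before being relied on.

```python
BIAS = 0x84   # 132, added before the segment search
CLIP = 32635  # largest magnitude that still fits after adding BIAS

def linear_to_ulaw(sample):
    """Compress one signed 16-bit sample to an 8-bit mu-law code."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, from bit 14 down to bit 7.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

def ulaw_to_linear(code):
    """Expand one 8-bit mu-law code back to a signed 16-bit-scale sample."""
    code = ~code & 0xFF
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if code & 0x80 else magnitude

for value in (0, 100, 1000, -1000, 32767):
    code = linear_to_ulaw(value)
    print(value, hex(code), ulaw_to_linear(code))
```

The encoder only needs a compare-and-shift loop of at most seven steps per sample, which is why no large compression table is required.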

  • Standard 8-bit speech uses a logarithmic encoding to increase signal/noise ratio I believe.

    So why not use a 12-bit DAC and an 8-bit -> 12-bit lookup table at the receiving end?

    There are other ways to encode speech more compressed, but they rapidly get more complex.
  • lardomlardom Posts: 1,659
    edited 2015-12-29 19:43
    Mark_T, the list of things I have to learn has just gotten longer. I want to learn everything I can.
    I know almost nothing about processing audio so I have a lot to learn. I'm assuming 'logarithmic' is how digital samples are interpolated. This is my guess. I'll research the subject.
    A "12-bit lookup table" sounds like it 'needs' pasm to work. I'm guessing it will improve the quality of the transmitted audio. That is exciting.
    Do you know of any examples of this that I can study?
  • The telephony thing seems to be:
    https://en.wikipedia.org/wiki/G.711#.CE.BC-Law

    A look up table is an array, you don't need PASM. You can put 12 bit values in a 16- or 32- bit integer,
    there's plenty of room(!)
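
A receive-side table along those lines might look like the sketch below. It assumes the sender used mu-law coding and that the DAC takes unsigned 12-bit values centered on 2048; both are assumptions for illustration, as are the names. The table has only 256 entries, so it is small even when each entry is stored in a 16- or 32-bit integer.

```python
def build_ulaw_to_12bit_table():
    """Build a 256-entry array mapping each mu-law byte to an unsigned 12-bit DAC value."""
    table = []
    for code in range(256):
        inverted = ~code & 0xFF
        exponent = (inverted >> 4) & 0x07
        mantissa = inverted & 0x0F
        # Standard mu-law expansion to a 16-bit-scale magnitude (0..32124).
        magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
        magnitude >>= 4  # scale down to 12-bit range (0..2007)
        signed = -magnitude if inverted & 0x80 else magnitude
        table.append(signed + 2048)  # offset so silence sits at mid-scale
    return table

DECODE_TABLE = build_ulaw_to_12bit_table()

def decode_packet(payload):
    """Turn a received payload of mu-law bytes into DAC values with one lookup each."""
    return [DECODE_TABLE[byte] for byte in payload]

print(decode_packet(bytes([0xFF, 0x80, 0x00])))
```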

  • Mark_T wrote: »
    The telephony thing seems to be:
    https://en.wikipedia.org/wiki/G.711#.CE.BC-Law

    A look up table is an array, you don't need PASM. You can put 12 bit values in a 16- or 32- bit integer,
    there's plenty of room(!)
    G.711 is my first look at a compression codec. I will study it and learn how it works. Thank you.
  • lardom wrote: »
    I might have to learn pasm because sampling audio would require it.
    Digital audio is a brand new area for me. I don't mind quality lower than telephone voice. I just want to know if one byte carries enough data.
    I'll be satisfied with my original goal of wireless motor control but being able to speak through a wireless robot would just be 'over the top' fun.

    Sorry, just found some time to read your post. I agree with the observations that have been made, but the question is: why do you need 8 bits, and do you mean to "speak" wirelessly live or from a recording?

    If recorded then that's easy and after leveling and compression you could do 8-bits but since you need memory anyway it is just as easy to go to 16-bits and get some great sound quality. A microSD is as cheap as it gets and you can treat it as a raw SPI memory if you like although for me it is just as easy to access it as FAT32 16-bit 44kHz wave files. For playback you essentially double buffer where one cog fills the next buffer while another cog plays the samples at "exactly" the playback rate. You'd be surprised how unintelligible it becomes if that last aspect is not observed! I have also demonstrated reading the SD card live without double-buffering but at a lower sample rate to handle the latency of reading a new sector.

    If it's live, then there is the problem of buffering packets, since some packets will get lost or be retransmitted, I would think, although I should look to see what Mark_T has done.
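
The double-buffering idea for recorded playback can be sketched as a ping-pong between two buffers. This is a sequential illustration only: on a Propeller the refill would run in one cog while another cog plays samples at a fixed rate, and `read_sector` and `play` here are hypothetical stand-ins for the SD read and the DAC output.

```python
def play_double_buffered(read_sector, sector_count, play):
    """Ping-pong between two buffers: play one while the other is refilled."""
    if sector_count == 0:
        return
    buffers = [read_sector(0), None]
    for index in range(sector_count):
        playing = index % 2
        filling = 1 - playing
        if index + 1 < sector_count:
            # On real hardware this refill would happen in parallel with playback.
            buffers[filling] = read_sector(index + 1)
        play(buffers[playing])

# Fake storage: five tiny "sectors" standing in for 512-byte SD sectors.
sectors = [bytes([i] * 4) for i in range(5)]
output = bytearray()
play_double_buffered(lambda i: sectors[i], len(sectors), output.extend)
print(bytes(output) == b"".join(sectors))
```

As a rough budget, assuming mono 16-bit samples at 44.1 kHz, a 512-byte sector holds 256 samples, so the refill has to finish in about 5.8 ms to keep playback at the exact rate.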

  • Peter Jakacki, I want to 'speak' wirelessly. I want the absolute minimum requirement to transmit intelligible speech over an 8-bit nrf24L01.
    I 'need' the challenge. Controlling motors with the transceiver is 'old hat'.
  • IIRC The nRF24L01 can run at 250kbaud and 1Mbaud, the nRF24L01+ can go to 2Mbaud, there isn't
    a hard requirement to limit to 8 bit audio given the bandwidth available.
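
To put numbers on that, here is a back-of-the-envelope sketch comparing raw audio bit rates with the air data rates mentioned above. It assumes uncompressed mono audio and the 32-byte maximum payload of the nRF24L01 family; usable throughput will be lower than the air rate once packet overhead, acknowledgements and retransmissions are counted. The function names are illustrative.

```python
import math

def audio_bitrate(sample_rate_hz, bits_per_sample):
    """Raw bit rate of uncompressed mono audio, in bits per second."""
    return sample_rate_hz * bits_per_sample

def packets_per_second(sample_rate_hz, bits_per_sample, payload_bytes=32):
    """Packets needed each second if every packet carries a full payload of audio."""
    bytes_per_second = audio_bitrate(sample_rate_hz, bits_per_sample) / 8
    return math.ceil(bytes_per_second / payload_bytes)

AIR_RATES_BPS = (250_000, 1_000_000, 2_000_000)

for rate_hz, bits in ((8000, 8), (8000, 16), (16000, 16)):
    needed = audio_bitrate(rate_hz, bits)
    shares = ", ".join(f"{100 * needed / air:.0f}%" for air in AIR_RATES_BPS)
    print(f"{rate_hz} Hz x {bits} bit = {needed} bit/s, "
          f"{packets_per_second(rate_hz, bits)} packets/s, share of air rate: {shares}")
```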
  • Mark_T wrote: »
    IIRC The nRF24L01 can run at 250kbaud and 1Mbaud, the nRF24L01+ can go to 2Mbaud, there isn't
    a hard requirement to limit to 8 bit audio given the bandwidth available.
    My primary goal is to complete a wireless 3-wheeled 'Bot with an arm and a pan-tilt camera.
    I thought you already wrote an object that transmits audio. I'll study your work.