Buffer concept isn't working

Patrick1ab · 2010-05-11 23:56

Hi everyone!

I tried to play audio streams with my Propeller - VS1053 decoder chip - W5100 Ethernet configuration.
Due to the lack of a SPI SRAM chip (I hope to get the parcel this week), I had to find a workaround.

This first version is working, but I can't get a clean sound, as the internal 8K buffer of the W5100 is too small and reading only 32 bytes at a time will cause additional delays:

repeat while net.SocketTCPestablished(0)
   net.rxTCP(0, @musicbuffer, 32)
   repeat until ina[6]==1
   decoder.WriteDataBuffer(@musicbuffer)

In a next attempt I wanted to create a second buffer (8191 bytes) using the RAM of my Propeller chip. Filling it up seems to work, but my decoder chip doesn't recognize the audio data.

start:= true
repeat while net.SocketTCPestablished(0)
   if start == true                                'start condition: fill the whole buffer
     net.rxTCP(0, @receivebuffer, 8191)
     start:=false
     k:=0
   elseif k == 10                                  'add 320 bytes to the end of the buffer if k==10
     net.rxTCP(0, @byte[receivebuffer][7870], 320)
     k:=0
   bytemove(@musicbuffer, @byte[receivebuffer][0], 32) 'move the lowest 32 bytes of "receivebuffer" to "musicbuffer"
   repeat until ina[6]==1
   decoder.WriteDataBuffer(@musicbuffer)
   receivebuffer >>= 256                           'shift the "receivebuffer" by 32 bytes
   k+=1

Is it allowed to use bitwise shift on an array as I did above? Am I perhaps making a mistake (wrong order) joining the different parts of my buffer while adding the 320 bytes?

Harrison. · 2010-05-12 00:27

You'll probably want to use some sort of FIFO Queue. There are examples on how to implement them in driver_socket.spin (which is included in the internet radio contest submission). You can also use two alternating buffers, but that probably won't give you enough buffering space.

Other than that, you shouldn't have any problems playing streams with only an 8 KB buffer. My first prototype didn't even use the SPI SRAM chips. You may also find that you'll need to tweak some TCP parameters to make it more responsive to packet loss.

Patrick1ab · 2010-05-12 00:58

Hi Harrison!

Thanks for your reply. I am going to take a look at the examples.

Harrison said...

Other than that, you shouldn't have any problems playing streams with only a 8KB buffer. My first prototype didn't even use the SPI SRAM chips. You may also find that you'll need to tweak some TCP parameters to make it more responsive to packet loss.

That's the disadvantage of the hardwired TCP stack. Tweaking TCP parameters is hardly possible.
I could change the timeout settings, but as far as I have seen, these are only used to determine whether a connection is still working or not.
There is also this "accept delayed ACK" setting, but I'm not sure if I should change this.

pullmoll · 2010-05-12 05:25

Patrick1ab said...
Is it allowed to use bitwise shift on an array as I did above? Am I perhaps making a mistake (wrong order) joining the different parts of my buffer while adding the 320 bytes?

What you want to accomplish with receivebuffer >>= 256 isn't possible that way: the shift operates on a single element, not on the whole array. You could longmove(@receivebuffer, @receivebuffer+32, (size-32)/4), but this wastes a lot of CPU cycles. Instead you should add a pointer (or index) and copy from there:
bytemove(@musicbuffer, @receivebuffer + tailptr, 32)
tailptr += 32
You could also receive to @receivebuffer + headptr instead of using k. headptr would always increase by the size you received. Instead of 320 you could use some value that doesn't lead to a split buffer, i.e. one that divides 8192 evenly, like 256.
Whenever headptr + 256 <> tailptr receive the next 256 bytes.
As long as the headptr step is bigger than the tailptr step, your buffer should always be filled to the limit.

HTH
Juergen

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Pullmoll's Propeller Projects

Post Edited (pullmoll) : 5/12/2010 5:32:06 AM GMT

MagIO2 · 2010-05-12 06:08

I don't know why you have two buffers, the receivebuffer and the musicbuffer. Is there overhead in the receivebuffer, like the TCP/IP header, or does it contain the pure data stream? If it's only the data stream, you could divide the buffer into two sections. Once the player has sent all the data of one section to the decoder, the loader can receive the next batch of data into that section while the player sends packets from the second section. In other words, you alternately load data into section one, then section two, then section one again, and so on.

The player has a pointer, so that it remembers which data to send next. When it moves from one section to the next, it signals the loader that the previous section is ready to be overwritten with the next batch of bytes. How that works has already been mentioned by pullmoll.

Just another comment to your code:
repeat until ina[6]==1
looks like you wait for a signal from the decoder. I'd do that with a waitpeq instruction, as it's much faster than the repeat until and has a deterministic timing. The repeat until timing will vary. If the pin is set to high just in front of the ina[6] instruction it will be much faster than it will be if the pin is set to high just after reading the pin status. Because then the SPIN code has to do the compare, jump back to the beginning of the loop and read the input again.
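
A minimal sketch of the two ways of waiting, side by side. It assumes, as in the code above, that the decoder's data-request line is on pin 6 and is active high; DREQ_PIN and the two method names are made up for illustration.

```spin
CON
  DREQ_PIN = 6                      ' assumed: decoder data-request line, as in the posts above

PUB WaitDreqPolling
  ' Spin-level polling: every pass costs a read, a compare and a jump,
  ' so the reaction time depends on where in the loop the pin goes high.
  repeat until ina[DREQ_PIN] == 1

PUB WaitDreqBlocking
  ' The cog stalls in hardware until (INA & mask) == state.
  ' The third argument (0) selects port A.
  waitpeq(|< DREQ_PIN, |< DREQ_PIN, 0)
```

Both versions rely on the pin being an input in the cog that does the waiting, which is the default after a cog starts.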

Patrick1ab · 2010-05-12 07:23

Hi!

Thanks for your replies.
I will try to change my buffer by adding pointers.

MagIO2 said...

I don't know why you have two buffers, the receivebuffer and the musicbuffer. Is there overhead in the receivebuffer, like the TCP/IP header, or does it contain the pure data stream? If it's only the data stream, you could divide the buffer into two sections. Once the player has sent all the data of one section to the decoder, the loader can receive the next batch of data into that section while the player sends packets from the second section.

There is no overhead in this buffer. It's pure audio data.
I did this because every time I get the "data request" signal on pin 6, I can only be sure that the next 32 bytes will fit into the decoder chip's internal buffer.
If I send more than 32 bytes, the signal on pin 6 could change to low meanwhile and the data is either ignored or parts which haven't been decoded yet are going to be overwritten.
Is there perhaps another solution (which isn't slower than my current one) to limit the amount of bytes read/sent from receivebuffer by 32?

MagIO2 · 2010-05-12 08:01

The high level design that I'd use in this case would be:

1 COG for reading the data from network
1 COG for sending the data in 32 byte packets to the decoder

Both use the same buffer and are synchronized. The buffer is twice as big as the packet size you want to read from the network.

Network COG
Read the first part of the buffer
set status first buffer loaded
loop
  wait for status "second part sent"
  load into second part of the buffer
  set status "second buffer loaded"
  wait for status "first part sent"
  load into first part of the buffer
  set status "first buffer loaded"

Send COG:
loop
  wait for "first buffer loaded"
  loop
    wait for signal from decoder
    send buffer content in 32 byte bunches
  set status "first part sent"

  wait for "second buffer loaded"
  loop
    wait for signal from decoder
    send buffer content in 32 byte bunches
  set status "second part sent"

If you don't have network problems, the COG sending data to the decoder should never run out of data.
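
The outline above could be sketched in Spin roughly as follows. This is only an illustration under several assumptions: the buffer is split into two 4096-byte halves, a single flag per half replaces the separate "loaded" and "sent" statuses, and FillHalf and SendChunk are empty placeholders for the real network and decoder calls, which are not shown here. All names and the stack size are hypothetical.

```spin
CON
  HALF_SIZE = 4096                  ' assumed: one half of an 8 KB buffer
  CHUNK     = 32                    ' bytes handed to the decoder per request

VAR
  byte buffer[HALF_SIZE * 2]
  byte loaded[2]                    ' loaded[n] <> 0: half n holds fresh data
  long sendStack[64]                ' stack for the sender cog (size is a guess)

PUB Start
  loaded[0] := false
  loaded[1] := false
  cognew(SendLoop, @sendStack)      ' sender runs in its own cog
  LoadLoop                          ' loader keeps running in this cog

PRI LoadLoop | half
  half := 0
  repeat
    repeat while loaded[half]       ' wait until the sender has released this half
    FillHalf(@buffer + half * HALF_SIZE)
    loaded[half] := true            ' hand the half over to the sender
    half ^= 1                       ' switch to the other half

PRI SendLoop | half, offset
  half := 0
  repeat
    repeat until loaded[half]       ' wait until the loader has filled this half
    repeat offset from 0 to HALF_SIZE - CHUNK step CHUNK
      SendChunk(@buffer + half * HALF_SIZE + offset)
    loaded[half] := false           ' release the half for reloading
    half ^= 1

PRI FillHalf(addr)
  ' Placeholder: receive exactly HALF_SIZE bytes into addr, calling the
  ' network driver repeatedly until that many bytes have really arrived.
  return

PRI SendChunk(addr)
  ' Placeholder: wait for the decoder's request line, then write CHUNK bytes from addr.
  return
```

Each flag is set by one cog and cleared by the other, so no lock is used in this sketch. Whether the real driver calls can be made from a second cog depends on how those drivers are written.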

Patrick1ab · 2010-05-12 14:59

Hmm... I tried to modify my code using pointers, but judging by the sound output I get, it's still not working.
Sometimes I can hear some music in there, then I get fragments (high beep tones), then a part of the music keeps repeating itself over and over again, afterwards again some music, and so on...

tailptr:=0
headptr:=0
...
if start == true                                'start condition: fill the whole buffer
  net.rxTCP(0, @receivebuffer, 8192)
  start:=false   
elseif headptr + 256 == tailptr                 
  net.rxTCP(0, @receivebuffer+headptr, 256)
  headptr+=256
  if headptr + 256 => 8192
    headptr:=0     
bytemove(@musicbuffer, @receivebuffer+tailptr, 32) 'move the lowest 32 bytes of "receivebuffer" to "musicbuffer"
waitpeq(|<6,|<6, 0)
decoder.WriteDataBuffer(@musicbuffer)
tailptr += 32                           'increment tail pointer by 32 bytes
if tailptr => 8192
  tailptr:=0

pullmoll · 2010-05-12 15:14

Patrick1ab said...
Hmm... I tried to modify my code using pointers, but judging by the sound output I get, it's still not working.
Sometimes I can hear some music in there, then I get fragments (high beep tones), then a part of the music keeps repeating itself over and over again, afterwards again some music, and so on...
tailptr:=0
headptr:=0
...
if start == true                                'start condition: fill the whole buffer
  net.rxTCP(0, @receivebuffer, 8192)
  start:=false   
elseif headptr + 256 == tailptr                 
  net.rxTCP(0, @receivebuffer+headptr, 256)
  headptr+=256
  if headptr + 256 => 8192
    headptr:=0     
bytemove(@musicbuffer, @receivebuffer+tailptr, 32) 'move the lowest 32 bytes of "receivebuffer" to "musicbuffer"
waitpeq(|<6,|<6, 0)
decoder.WriteDataBuffer(@musicbuffer)
tailptr += 32                           'increment tail pointer by 32 bytes
if tailptr => 8192
  tailptr:=0

You should use headptr + 256 <> tailptr, because this is the "there is space in the buffer" condition.
I wouldn't use a start condition to fill the entire buffer, just let the default fill do it. The buffer will never be 100% full because of the test with +256.
Instead of if xxxptr => 8192 ... you can just do xxxptr &= 8191 or, even easier to read, xxxptr := (xxxptr + <size>) & 8191. All in one expression.

HTH
Juergen

Edit: Here's how I would do it:

tailptr:=0
headptr:=0
...
if headptr + 256 <> tailptr                 
  net.rxTCP(0, @receivebuffer+headptr, 256)
  headptr := (headptr + 256) & 8191
if tailptr <> headptr
  bytemove(@musicbuffer, @receivebuffer+tailptr, 32) 'move the lowest 32 bytes of "receivebuffer" to "musicbuffer"
  tailptr := (tailptr + 32) & 8191                           'increment tail pointer by 32 bytes
waitpeq(|<6,|<6, 0)
decoder.WriteDataBuffer(@musicbuffer)

Writing the musicbuffer if there is no data in the receivebuffer may be a bad idea. You could also bytefill(@musicbuffer, 0, 32) in an else branch, i.e. if tailptr == headptr. Then, if your receiver dies, you will hear silence.

Post Edited (pullmoll) : 5/12/2010 3:24:02 PM GMT

Patrick1ab · 2010-05-12 15:57

pullmoll said...

You could also bytefill(@musicbuffer, 0, 32) in an else branch, i.e. if tailptr == headptr. Then, if your receiver dies, you will hear silence.

Sending 32 bytes of zeros to the decoder chip doesn't necessarily have to be equal to silence.

Since I've got a repeat loop surrounding this code, would this also work?

if tailptr == headptr
  next

pullmoll · 2010-05-12 16:55

Patrick1ab said...

pullmoll said...

You could also bytefill(@musicbuffer, 0, 32) in an else branch, i.e. if tailptr == headptr. Then, if your receiver dies, you will hear silence.

Sending 32 bytes of zeros to the decoder chip doesn't necessarily have to be equal to silence.

Since I've got a repeat loop surrounding this code, would this also work?
if tailptr == headptr
  next

It should work, sure. I didn't know what kind of data you're streaming. Is it mpeg2 layer3, aka mp3? Then sending $ff should lead to silence IIRC.

Patrick1ab · 2010-05-12 17:30

Some streams will be mp3, some aac and I've even got one station streaming wma.

The sound output is still not working correctly. What worries me most is the repetition of song parts (like data is being sent more than once).
That cannot happen with this code, can it? Perhaps this is a fault of my W5100 driver...trying to read data even if it has already been read.
Maybe the address auto increment function I implemented has a bug (although it's working for transmission)... I'll switch it off and try again

Hmm... no, this is working fine and the problem still exists if I switch it off.

Post Edited (Patrick1ab) : 5/12/2010 5:35:57 PM GMT

pullmoll · 2010-05-12 17:53

Patrick1ab said...
The sound output is still not working correctly. What worries me most is the repetition of song parts (like data is being sent more than once).
That cannot happen with this code, can it? Perhaps this is a fault of my W5100 driver...trying to read data even if it has already been read.

I thought again about what I wrote and of course there is a major bug in there. The code would work for a step of 1, i.e. adding and retrieving bytes, but the buffer is overrun with the tests as they are, because the filling step doesn't always match the boundary of the (different sized) emptying step.

This means you'd have to switch to a slightly different method: filling level.

tailptr:=0
headptr:=0
filled:=0
...
if filled + 256 < 8192
  net.rxTCP(0, @receivebuffer+headptr, 256)
  headptr := (headptr + 256) & 8191
  filled += 256
if filled => 32
  bytemove(@musicbuffer, @receivebuffer+tailptr, 32) 'move the lowest 32 bytes of "receivebuffer" to "musicbuffer"
  tailptr := (tailptr + 32) & 8191                           'increment tail pointer by 32 bytes
  filled -= 32
waitpeq(|<6,|<6, 0)
decoder.WriteDataBuffer(@musicbuffer)

Now this should really work, while I have no way to verify the code.

Patrick1ab · 2010-05-12 18:23

Still no change... I'm really starting to believe that this is a problem with the Ethernet driver I'm using.

In the attachments there are some examples of the audio output.

MagIO2 · 2010-05-12 19:16

32 bytes is not very much data, so I assume that your loop is simply too slow to deliver the data fast enough. Have a look at my second post. You should run the stuff in 2 COGs. This way the buffer will always have enough data for feeding the decoder.

Patrick1ab · 2010-05-12 20:41

MagIO2 said...
32 bytes is not very much data, so I assume that your loop is simply too slow to deliver the data fast enough. Have a look at my second post. You should run the stuff in 2 COGs. This way the buffer will always have enough data for feeding the decoder.

I can try that too, but I'm 90 % sure that this won't solve the problem, as a buffer underrun "sounds" different.
Please compare the sound example of a buffer underrun (which I created by adding a 500us delay to my existing spin code) below with the ones I posted earlier.

I should also mention that the file I used to create the buffer underrun has a peak bitrate of 340 kb/s at the beginning. When the voice is becoming a bit clearer it still has around 200 kb/s.
It depends on the stream, but Internet streams usually have only 128 kb/s.

Post Edited (Patrick1ab) : 5/12/2010 8:50:08 PM GMT

pullmoll · 2010-05-12 20:43

Patrick1ab said...
Still no change... I'm really starting to believe that this is a problem with the Ethernet driver I'm using.

In the attachments there are some examples of the audio output.

That sounds as if the same buffer was played several times. I'm not sure what happens if your decoder.WriteDataBuffer isn't served fast enough. You could try to get rid of the musicbuffer and call decoder.WriteDataBuffer(@receivebuffer+tailptr) as soon as the waitpeq returns. Perhaps you could try to let 2 cogs do the filling and playing of the buffer, so that reading from TCP doesn't block the playing of the next 32 byte chunk for too long.

Otherwise I'm out of ideas. The last resort would be trying to achieve the same thing in PASM, which should be fast enough. I just don't know if your decoder is easy to interface from assembler.

Patrick1ab · 2010-05-12 20:58

pullmoll said...
Otherwise I'm out of ideas.

Perhaps I've got another one, although I think it's very unlikely:

Could it be that the internal buffer of the ethernet chip isn't read fast enough?
Maybe the chip sends a "retransmit" to the streaming server several times and this causes duplicate or fragmented data in the buffer?

MagIO2 · 2010-05-12 22:40

That's why I suggest using 2 COGs: it gives the network COG all the time it needs to get the next batch of bytes until half of the buffer has been sent by the decoder driver. The decoder driver, on the other hand, will usually never run out of data, as it's much slower than the network transfer (10 Mbit against 128 kbit). So synchronization in one direction is only needed to harden the code against network problems.

I don't know the network module and its drivers, but maybe there are some parameters you could optimize as well. For example there is an MTU (maximum transmission unit, if I remember right) which tells the network interface how large the payload of a TCP/IP packet can be at most. Maybe it's wise to request that number of bytes whenever you do an rxTCP.

MagIO2 · 2010-05-12 22:49

By the way ... what's the transfer rate between W5100 and the propeller?

Patrick1ab · 2010-05-13 00:11

Maximum transfer rate while receiving should be around 1750 kByte/s for the PASM part @ 80MHz clock speed.
However, as you have to call the rxTCP function again after each 8K (and do some address calculations), I would say you are able to reach 440 kBytes/s.
I haven't tested this yet. I only did some tests for the data transmission, which has 2-3 PASM instructions less:
I was able to download data from my Propeller Webserver at 510 kBytes/s. This is enough bandwidth to stream DivX video contents.

Patrick1ab · 2010-06-01 23:40

Ouch! I'm so stupid.

After months of searching for the bug in my streaming program (music with hiccups and repetition all the time), I finally found it:

It is indeed a buffer underrun, but not in the propeller chip.
It occurs in the W5100 chip itself, as the propeller code is much quicker than the data rate / bit rate of the stream.
(for example: a stream with 96 kBit/s would only need 48 loop cycles per second while reading 256 Bytes at a time)

This is what happens:

- The line

net.rxTCP(0, @receivebuffer+headptr, 256)

tries to load 256 bytes from the rx buffer of the W5100, but there is no guarantee that there are 256 bytes to be read.
- Assuming that we are always reading 256 bytes, the pointer is incremented by 256 (even if the number of bytes read is less)

headptr := (headptr + 256) & 8191
filled += 256

So parts of the "old data" are accidentally declared as "new data".
In the worst case (due to a temporary loss of internet bandwidth), there are several parts in a row which are not being updated, but declared as "new data".
That's when the music keeps repeating over and over again.

To solve this I have to check the actual number of bytes received

len:=net.rxTCP(0, @receivebuffer+headptr, 256)

and then increment the pointers / variables according to this length.

Unfortunately this first attempt does not work:

repeat while net.SocketTCPestablished(0)

     if filled + 256 < 8192                 
        len:=net.rxTCP(0, @receivebuffer+headptr, 256)
        headptr := (headptr + len) & 8191
        filled += len
     if filled => 32
        bytemove(@musicbuffer, @receivebuffer+tailptr, 32) 'move the lowest 32 bytes of "receivebuffer" to "musicbuffer"
        tailptr := (tailptr + 32) & 8191                           'increment tail pointer by 32 bytes
        filled -= 32
     else
        next
     waitpeq(|<6,|<6, 0)
     decoder.WriteDataBuffer(@musicbuffer)

It stops after a second of playback.

Timmoore · 2010-06-02 00:04

What happens when headptr ends up less than 256 bytes below 8192? I.e. rxTCP received fewer than 256 bytes, you incremented headptr to within 256 of 8192, and then the next rxTCP reads 256 bytes?

Patrick1ab · 2010-06-02 07:18

Sorry, I don't understand your question. Is it about the previous version or about the last changes I made?

Timmoore · 2010-06-02 15:51

In your latest code

if filled + 256 < 8192                 
        len:=net.rxTCP(0, @receivebuffer+headptr, 256)
        headptr := (headptr + len) & 8191
        filled += len

If headptr is just over 256 bytes from 8192 (i.e. the end of the buffer) and you get fewer than 256 bytes in, then headptr ends up less than 256 bytes from the end of the buffer. If rxTCP returns 256 bytes the next time, you will run past the end of the buffer, overwriting whatever is there.
E.g. if headptr is 7930 and rxTCP gives you 50 bytes, headptr becomes 7980; if rxTCP then gives you 256 bytes, it will write up to 8236.
The following code limits the rxTCP length to the space remaining before the end of the buffer, or to 256 if more than that remains:

if filled + 256 < 8192                 
        len:=net.rxTCP(0, @receivebuffer+headptr, (8192-headptr)<#256)
        headptr := (headptr + len) & 8191
        filled += len
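
Putting the last few posts together, the bookkeeping could be sketched as a small set of Spin methods like the ones below. This is a hedged illustration, not tested code: the method names and the constants RX_MAX and CHUNK are hypothetical, it assumes rxTCP returns the number of bytes actually written (as the len := ... line above suggests), and it treats zero or negative results as "nothing received", which is an assumption about the driver.

```spin
CON
  BUF_SIZE = 8192                   ' must be a power of two for the mask to work
  BUF_MASK = BUF_SIZE - 1
  RX_MAX   = 256                    ' largest single read requested from the network chip
  CHUNK    = 32                     ' bytes sent to the decoder per request

VAR
  byte receivebuffer[BUF_SIZE]
  long headptr, tailptr, filled

PUB RxAddress
  ' Where the next received bytes should be written.
  return @receivebuffer + headptr

PUB RxLimit
  ' Largest read that neither runs past the end of the array
  ' nor overwrites data that has not been played yet (may be 0).
  return (BUF_SIZE - headptr) <# (BUF_SIZE - filled) <# RX_MAX

PUB RxCommit(len)
  ' Account only for the bytes that actually arrived.
  if len > 0
    headptr := (headptr + len) & BUF_MASK
    filled += len

PUB NextChunk : addr
  ' Address of the next CHUNK bytes to play, or 0 if fewer than CHUNK are buffered.
  if filled => CHUNK
    addr := @receivebuffer + tailptr
    tailptr := (tailptr + CHUNK) & BUF_MASK
    filled -= CHUNK
```

In the main loop the receive step would then call rxTCP with RxAddress and RxLimit (skipping the call when RxLimit is 0) and pass the result to RxCommit, and the play step would call NextChunk and write to the decoder only when it returns a non-zero address. Because tailptr only ever moves in steps of 32 from 0, a chunk never straddles the end of the array. The sketch assumes all of these methods are called from a single cog.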

Patrick1ab · 2010-06-02 15:56

Timmoore said...
If rxTCP returns 256 bytes the next time, you will run past the end of the buffer, overwriting whatever is there.
E.g. if headptr is 7930 and rxTCP gives you 50 bytes, headptr becomes 7980; if rxTCP then gives you 256 bytes, it will write up to 8236.

Okay, now I understand what you mean.

Thanks!
