TonyB_, I think I will implement your filters. They take care of the eight-bit issue nicely. For highest speed, they are necessary.
I wish we had infinite logic, so that we could also implement a 16-bit converter. Then, the streamer could write 16-bit samples to memory whenever it wanted to, as well.
Thanks, Chip. I don't mind if other filters are used, whatever can fit. I counted how many register bits in total were in use the other day for the scope mode and it came to 32x3 + 1x16 = 112. I'm not sure whether that is the limit and the 80 taps you need for one of your cascaded filters are impossible or not. If more bits are available then I could have a look at slightly larger windows that might avoid the need for clamping one of them. Having said that, Hann/Tukey and Blackman might not go together so well as now (see pic below). I've read some good things about the Blackman function, which is a combination of cosine-squared and cosine-squared-squared.
Yes, we have 3x32 + 16 flops available in the smart pin. That's 112. Everything keeps fitting perfectly.
TonyB_, I'm thinking maybe we could have an additional mode where those 17 bits that are used for triggering get reused for a smoothing filter, so that we can get 12-bit samples out per clock. It could feed from the 68-tap Tukey.
What is the total sum value for the Tukey68? Its top bits are 255, but there are 2-3 lower bits, as well, right?
TonyB_, for an RC-type smoothing filter, it would be ideal to have something that goes to the maximum value for its number of bits, like a sum of $7FF, $3FF, $1FF, or $FF. Any trick we can employ to get Tukey68 to sum to $7FF, instead of $7F8?
We could get there with sum[10:0] + sum[10:8]. Maybe that could be part of the smoothing filter.
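As a quick check of that arithmetic, here is a minimal Python sketch. The function name `stretch_to_7ff` is made up for illustration, and it assumes the Tukey68 sum never exceeds 2040 ($7F8), as stated in the thread.

```python
def stretch_to_7ff(total):
    """Model of sum[10:0] + sum[10:8]: add the top three bits back in.

    Assumes total is an 11-bit value no larger than 2040 ($7F8),
    so the result still fits in 11 bits.
    """
    total &= 0x7FF                 # sum[10:0]
    top3 = (total >> 8) & 0x7      # sum[10:8]
    return total + top3


if __name__ == "__main__":
    for value in (0, 255, 256, 1020, 2040):
        print(f"{value:4d} -> {stretch_to_7ff(value):4d}")
```

The full-scale input of 2040 picks up 7 from its own top bits and lands on 2047 ($7FF), while zero stays at zero and the mapping never decreases in between.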
Chip, I'll look into it now. A long Tukey-type window is definitely the best way to get more than an eight-bit output every clock as I don't think we have the logic for an extra filter.
Here are the four windows for the BASIC program above:
Tukey 21/40 "Tukey68"
1, 2, 4, 6, 8,11,14,17,20,23,26,29,32,34,36,38,39 Ramp up/down = 340 x 2
40[34] Plateau = 1360
39,38,36,34,32,29,26,23,20,17,14,11, 8, 6, 4, 2, 1 Grand total = 2040, /8 = 255
Tukey 19/34 "Tukey45"
1, 2, 4, 6, 8,11,14,17,20,23,26,28,30,32,33 Ramp up/down = 255 x 2
34[15] Plateau = 510
33,32,30,28,26,23,20,17,14,11, 8, 6, 4, 2, 1 Grand total = 1020, /4 = 255
Hann 19/34 "Hann30"
1, 2, 4, 6, 8,11,14,17,20,23,26,28,30,32,33 Ramp up/down = 255 x 2
33,32,30,28,26,23,20,17,14,11, 8, 6, 4, 2, 1 Grand total = 510, /2 = 255
Blackman 15/24 "Blackman22"
1, 2, 4, 6, 8,11,14,17,20,22,23 Ramp up/down = 128 x 2
23,22,20,17,14,11, 8, 6, 4, 2, 1 Grand total = 256 *** clamp to 255 ***
I'm curious to know how these windows affect the rise and fall times of square waves. The new Blackman window above is the shortest and thus should have the fastest response.
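For anyone who wants to check the totals above or get a rough feel for the rise-time question, here is a small Python sketch. It treats each window as a plain FIR filter on a 1-bit stream that steps from all 0s to all 1s, which ignores the analog front end entirely. The helper names and the 10%-90% definition of rise time are my own assumptions, not anything from the BASIC program.

```python
def symmetric_window(ramp, plateau_value=0, plateau_len=0):
    """Ramp up, optional flat plateau, then the same ramp mirrored."""
    return ramp + [plateau_value] * plateau_len + ramp[::-1]


# Ramps copied from the tables in the post above.
RAMP_TO_40 = [1, 2, 4, 6, 8, 11, 14, 17, 20, 23, 26, 29, 32, 34, 36, 38, 39]
RAMP_TO_34 = [1, 2, 4, 6, 8, 11, 14, 17, 20, 23, 26, 28, 30, 32, 33]
RAMP_TO_24 = [1, 2, 4, 6, 8, 11, 14, 17, 20, 22, 23]

WINDOWS = {
    "Tukey68": symmetric_window(RAMP_TO_40, 40, 34),
    "Tukey45": symmetric_window(RAMP_TO_34, 34, 15),
    "Hann30": symmetric_window(RAMP_TO_34),
    "Blackman22": symmetric_window(RAMP_TO_24),
}


def step_response(taps):
    """Filter output while an all-ones bitstream shifts into the window."""
    out, acc = [], 0
    for tap in taps:
        acc += tap
        out.append(acc)
    return out


def rise_time_10_90(taps):
    """Clocks between first reaching 10% and first reaching 90% of full scale."""
    total = sum(taps)
    resp = step_response(taps)
    t10 = next(i for i, v in enumerate(resp) if 10 * v >= total)
    t90 = next(i for i, v in enumerate(resp) if 10 * v >= 9 * total)
    return t90 - t10


if __name__ == "__main__":
    for name, taps in WINDOWS.items():
        print(f"{name}: taps={len(taps)} total={sum(taps)} "
              f"rise={rise_time_10_90(taps)} clocks")
```

Under these assumptions the rise time simply tracks the window length, so Blackman22 comes out fastest and Tukey68 slowest.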
Thinking more, we just need a sum that has 12 contiguous 1's in its top bits.
Is that essential, given that's well outside the rails on the ADC?
I guess $FFF is useful for easy clipping sense on the higher gain ranges, but 2040 could also be tested for?
The main issue with the Tukey/Hann/Blackman windows is their max values are only about half of the cascaded moving averages (CMA for short).
As an alternative, with fairly minor tap changes and a little bit of skipping, the long CMA can fit into 69 of the 70 available taps including zeroes, as follows:
But those aren't really cascaded moving average filters, because they don't have any moving averages, right?
From The Scientist and Engineer's Guide to Digital Signal Processing:
As the name implies, the moving average filter operates by averaging a number of points from the input signal to produce each point in the output signal... Multiple-pass moving average filters involve passing the input signal through a moving average filter two or more times.
But those filters I made just keep summing error terms, not tracking some specific number of samples.
As best as I can tell, the response is the same. I think the triple integrated implementation is more efficient than literally using moving average filters. The reason for that is bit growth. So the first filter needs one bit of memory per sample. Assuming we track the delta of the sum instead of the absolute sum, we get a 2 bit delta output from the first filter. (The moving average sum can go up or down by at most 1 each clock cycle.) But then we need 2 bits per sample for the next filter. And 3 bits per sample for the final filter. So we use a bit more memory, and still need to integrate the output 3 times.
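One way to see why the responses can match is the standard integrator/comb identity: a moving sum of length N equals an integrator followed by a difference N samples apart, and because both pieces are linear they can be reordered so that all the integrators come first. The Python sketch below only illustrates that identity on an arbitrary test bitstream; it is not a model of the actual smart pin logic, and all function names are hypothetical.

```python
def moving_sum(x, n):
    """Literal moving average (unnormalized): sum of the last n samples."""
    return [sum(x[max(0, i - n + 1): i + 1]) for i in range(len(x))]


def integrate(x):
    """Running total of the input."""
    out, acc = [], 0
    for v in x:
        acc += v
        out.append(acc)
    return out


def comb(x, n):
    """Difference between the current sample and the one n clocks earlier."""
    return [v - (x[i - n] if i >= n else 0) for i, v in enumerate(x)]


def cascaded_moving_sums(x, lengths):
    """Pass the signal through one moving sum per entry in lengths."""
    for n in lengths:
        x = moving_sum(x, n)
    return x


def integrate_then_comb(x, lengths):
    """Integrate once per stage first, then apply each stage's comb."""
    for _ in lengths:
        x = integrate(x)
    for n in lengths:
        x = comb(x, n)
    return x


if __name__ == "__main__":
    # Arbitrary deterministic 1-bit test pattern, not real ADC data.
    bits = [(i * 7 + i // 3) % 2 for i in range(200)]
    a = cascaded_moving_sums(bits, (4, 4, 8))
    b = integrate_then_comb(bits, (4, 4, 8))
    print("outputs identical:", a == b)
```

The literal version has to remember every sample in each stage's window, while the integrate-then-comb version only needs running totals plus the delayed values for the differences.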
One could see error values as deltas. Summing those is essentially an average centered on some value, is it not?
That is how I see it. It acts just like an RC filter. I'm trying to figure out how to introduce the inductors, as well, to get better performance, more like a pi filter.
Does anyone have any better ideas than the filter I implemented to do 16-bit conversions? It was a very simple notion and it works well, but there may be some better techniques. Maybe something else would take less logic, or have better pass-band characteristics, or settle faster.
Being able to grab a 16-bit conversion whenever you want it is really nice.
Maybe the scope filter could substitute for some of the stages.
Yes! I think it would be good to use the longest Tukey as the input to the recursive averaging filter.
That is how I see it. It acts just like an RC filter.
Totally. Once you all broke it down, I could see that and thought it very clever. It's very interesting how math can work sometimes.
We have lots of higher-order functions that get us places in a few steps, or even one. This kind of approach, iterating and accumulating, takes many steps. It seems that old-world, ancient-style methods brought forward into our modern age do make sense. It's just an application problem. We've still got all the old ways, but we do not always see how they can be used today. And that's expected, because we also have the modern means and methods. Few people go looking, or at least it seems that way to me.
But, at this level, where it's silicon, and fast and simple can make sense, doing that looking also makes sense. Honestly, this whole discussion has been very illuminating to me personally. Good stuff.
Re: Inductor.
Sure seems like it's going to be a subtraction, out of phase, and out of time somehow. Very hard to picture. May also be selective skips too. Like the absence of an operation when the inductor would assert itself in the process. Something like that. There isn't really anything else, other than maybe a parallel stream of bits, to represent that domain, to be combined at the accumulator.
Now it's my turn. Chip, what is this recursive averaging filter?
I haven't been talking about the 4th-order 22-bit filter that returns a 16-bit sample every clock. When I saw it was 50 ALMs bigger than the windowed filters I moved on.
cmafilter(x,y,z) is a cascaded moving average. It means filter by a moving average of length x. Then filter the output of that by a moving average filter of length y. Do it again with a moving average of length z. If all 3 numbers were the same it would be a sinc3 filter. This is slightly more general. We basically have a sinc3 with the center part lengthened.
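To make that concrete, the equivalent tap weights of such a cascade can be found by convolving three boxcars together. This is a minimal Python sketch; `cmafilter_taps` is a hypothetical helper written for illustration, not the cmafilter routine itself, whose code is not shown here.

```python
def boxcar(n):
    """Tap weights of an unnormalized moving average of length n."""
    return [1] * n


def convolve(a, b):
    """Plain integer convolution of two tap lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


def cmafilter_taps(x, y, z):
    """Overall window of three cascaded moving averages of lengths x, y, z."""
    return convolve(convolve(boxcar(x), boxcar(y)), boxcar(z))


if __name__ == "__main__":
    taps = cmafilter_taps(4, 4, 8)
    print(taps)
    print("taps:", len(taps), "total:", sum(taps))
```

The window has x + y + z - 2 taps and sums to x * y * z, so cmafilter(4,4,8) gives 14 taps totalling 128. Making the third length longer than the first two stretches the flat middle of the window, which is the "center part lengthened" effect.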
Comments
What is the total sum value for the Tukey68? Its top bits are 255, but there are 2-3 lower bits, as well, right?
Output ranges:
Tukey68 = 0-2040
Tukey45 = 0-1020
Hann30 = 0-510
Blackman22 = 0-256
Did you have to re-allocate some bits for your ~80-point cascaded window?
I used the trigger bits. I just wanted to see what 80 taps looked like. 68 taps are pretty good, actually.
We could get there with sum[10:0] + sum[10:8]. Maybe that could be part of the smoothing filter.
We can repurpose the 17 trigger bits:
error[11:0] = input[11:0] - filter[16:5]
filter[16:0] = filter[16:0] + {{5{error[11]}}, error[11:0]}
sample[11:0] = filter[16:5]
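Here is a rough Python model of those three lines, to show the RC-like settling. It assumes the input never exceeds $7FF, so that the difference fits in the 12-bit signed error term; the function names are made up, and this is only a behavioural sketch, not the real logic.

```python
FILTER_BITS = 17   # the repurposed trigger bits
SHIFT = 5          # sample = filter[16:5]
FILTER_MASK = (1 << FILTER_BITS) - 1


def smooth_step(filt, sample_in):
    """One clock of the smoothing filter; returns (new_filter, sample_out).

    Assumes 0 <= sample_in <= 0x7FF so the error fits in 12 signed bits.
    """
    error = sample_in - (filt >> SHIFT)       # input - filter[16:5]
    filt = (filt + error) & FILTER_MASK       # add the sign-extended error
    return filt, filt >> SHIFT


def run(samples, filt=0):
    """Feed a list of input samples through the filter, one per clock."""
    out = []
    for s in samples:
        filt, y = smooth_step(filt, s)
        out.append(y)
    return out


if __name__ == "__main__":
    rising = run([0x7FF] * 1000)
    for clocks in (1, 32, 64, 128, 1000):
        print(f"after {clocks:4d} clocks: {rising[clocks - 1]}")
```

Because the filter keeps five fractional bits, each clock closes roughly 1/32 of the remaining gap, so the output behaves like an RC filter with a time constant of about 32 clocks and eventually settles exactly on the input value.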
The main issue with the Tukey/Hann/Blackman windows is their max values are only about half of the cascaded moving averages (CMA for short).
As an alternative, with fairly minor tap changes and a little bit of skipping, the long CMA can fit into 69 of the 70 available taps including zeroes, as follows:
What would have been 81 is skipped twice. I've shown 80 as part of the ramps for clarity (I hope). Some skipping is needed for the other windows, too.
Your new triple-integrated filters as described by Saucy:
http://forums.parallax.com/discussion/comment/1457960/#Comment_1457960
A short name is handy and, for want of anything better, I'm calling the new window functions CMAx as follows:
cmafilter(4,4,8) = thirdorderfilter(4,0) = CMA14
cmafilter(4,4,16) = thirdorderfilter(4,8) = CMA22
cmafilter(8,8,16) = thirdorderfilter(8,0) = CMA30
cmafilter(8,8,32) = thirdorderfilter(8,16) = CMA46
cmafilter(8,8,64) = thirdorderfilter(8,48) = CMA78
As CMA78 is too big to fit, I created a new CMA67 to replace it earlier today:
http://forums.parallax.com/discussion/comment/1458014/#Comment_1458014
Below is a list of CMAx and corresponding Tukey/Hann/Blackman for the four windows available in scope mode. I've doubled the Tukey and Hann values for easier comparison.
Skipping allows for alternatives to CMA22 that sum to 256 or 512.
CMA67 has not been tested yet. It will give us a 12-bit sample on every clock and one more config bit compared to Tukey68. I think separate graphs for each of these four pairs above would be helpful because it's a straight choice between one set or the other.