How many counters and adders are there in each smart pin?
I'm not arguing in favour of Tukey at the expense of anything else. From the output of a binary counter, only a little logic would be needed to generate a small error term, which when added to the counter would create the Tukey value to add to the accumulator. One counter and two adders are needed, and Sinc3 requires two adders.
@ErNa
Here are some real streams from P2 silicon.
8 channels of bitstream x 64000 clocks
TonyB_, I see now that the Tukey filter is the best when you want to "open the valve" in the shortest number of samples, but if shortening the time is not critical, a triangle filter is cheap. Any trapezoid window could be improved by a Tukey window in the same place, but Tukey256 would be very expensive compared to Trap256.
A triangular window has a gain of 0.5, whereas a Tukey window could be close to 1.0. What does that mean for the effective number of bits?
How is the gain pushing 2x? Double gain adds 1 bit, right?
You lose a bit of resolution with a triangular window compared to a near rectangular one, but if you've removed more noise then it would be worth it. You need to scale the outputs from the triangular and Tukey windows so they match, then compare the noise.
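To put rough numbers on the gain question, here is a small Python sketch. It assumes "gain" means the average window weight and that each doubling of gain is worth one bit, as suggested above. The 256-tap length and the Tukey taper fraction of 0.25 are arbitrary choices for illustration, not values from this thread, and the function names are made up.

```python
import math

def triangular(n_taps):
    """Triangular (Bartlett) window: ramps 0 -> 1 -> 0."""
    last = n_taps - 1
    return [1.0 - abs(2.0 * i / last - 1.0) for i in range(n_taps)]

def tukey(n_taps, taper=0.25):
    """Tukey (tapered cosine) window: cosine ramps at both ends, flat top."""
    last = n_taps - 1
    edge = taper * last / 2.0      # length of each cosine ramp, in samples
    weights = []
    for i in range(n_taps):
        d = min(i, last - i)       # distance to the nearest end
        if d < edge:
            weights.append(0.5 * (1.0 - math.cos(math.pi * d / edge)))
        else:
            weights.append(1.0)
    return weights

def gain(window):
    """Average weight of the window (1.0 for a rectangular window)."""
    return sum(window) / len(window)

tri_gain = gain(triangular(256))
tukey_gain = gain(tukey(256))
# Assumption from the thread: doubling the gain is worth one bit.
bits_difference = math.log2(tukey_gain / tri_gain)
print(tri_gain, tukey_gain, bits_difference)
```

This only compares signal gain. As noted above, the noise each window lets through still has to be compared after scaling the outputs to match.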
The limit for sinc4 is about 180 clocks. So about 1.4MSPS minimum. To have to do 4 differentiations at that rate wouldn't leave much time to do anything with the result.
Sinc3 outputting at 250ksps seems like a good rate for audio. Just add an anti-aliasing filter that reduces the rate to about 50ksps.
I wonder if it's possible to add a 4th stage in software, running at the reduced rate.
The good part about the sinc filter is that there is no counter to synchronize with the ADC reading process. If we generated the window in time domain, we would have to collect the output while the window is zero, to avoid edge discontinuities.
So, for 1024 clocks:
ACC1 = 11 bits
ACC2 = 21 bits
ACC3 = 32 bits
The time base and clock counter can work in the X register for automatic ACC3 reporting. WXPIN sets up the number of clock cycles for ACC3 reporting.
ACC1 and ACC2 can work in the Y register.
ACC3 can work in the Z register, which always reports the result via RDPIN, anyway.
Also consider the binomial filter. The coefficients are the odd rows of Pascal's triangle, and they always add up to a power of two. Its frequency response is monotonic without "bumps" at the higher frequencies.
-Phil
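A quick Python sketch of the power-of-two property, for anyone who wants to try it. It builds the taps for one row of Pascal's triangle and checks the sum. The function name and the choice of row 8 are just for illustration.

```python
from math import comb

def binomial_taps(order):
    """Row `order` of Pascal's triangle, e.g. order 4 -> [1, 4, 6, 4, 1]."""
    return [comb(order, k) for k in range(order + 1)]

taps = binomial_taps(8)
# The taps of row n sum to 2**n, so normalising is a right shift by n bits.
total = sum(taps)
print(taps, total)
```

Even values of `order` give an odd number of taps with a single centre tap.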
If we dedicate X, Y, and Z registers to 32-bit accumulator functions, we still have 16 bits left in some extra flops to track time and report ACC3 routinely.
So, ideally, all ACC's should be 32 bits and rolling over with no cares, as long as we sample ACC3 before it rolls over?
If we have all 32-bit accumulators, it keeps the software simpler, right?
It seems to me that we can do 2954 samples before overflowing ACC3 at 32 bits.
On my phone:
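rem a, b, c stand in for ACC1, ACC2, ACC3, fed a constant 1 on every clock
a = 0
b = 0
c = 0
for x = 1 to 2954
  c = c+b
  b = b+a
  a = a+1
next x
rem final accumulator values, then the number of bits each one needs
print a,b,c
print (log10(a) / log10(2))
print (log10(b) / log10(2))
print (log10(c) / log10(2))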
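The same experiment in Python, as a cross-check on the 2954 figure. It assumes the same model as the snippet above: a constant input of 1, and each stage adding the previous stage's old value. In that model the third accumulator after n clocks works out to the binomial coefficient C(n, 3), so the limit can be found directly. Function names are made up for illustration.

```python
from math import comb

def run_accumulators(clocks):
    """Mirror of the loop above: returns (a, b, c) after `clocks` steps."""
    a = b = c = 0
    for _ in range(clocks):
        c += b
        b += a
        a += 1
    return a, b, c

def max_clocks(bits=32):
    """Largest clock count for which the third accumulator still fits."""
    n = 0
    while comb(n + 1, 3) < 2 ** bits:
        n += 1
    return n

limit = max_clocks()
print(limit, run_accumulators(limit))
```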
But if we run SINC3 continuously, we won't need to do any windowing, right?
Yep.
I think James is thinking about how to do single discontinuous samples for instrumentation calibrating.
As far as I can tell the ramping is built into the filters already. But a filter adds lag too, and this is where software can do a better job by selectively ignoring the leading samples. We just need the accumulators in hardware.
Couldn't we do instrumentation calibration by waiting for the 4th sample?
Not quite. We have to worry about the output stage too. Although if we did extended math on the output side it might be fine.
This thing has a gain of G=(RM)^N. N=3 for three accumulators. So our maximum RM is 1625. The reason is 1625^3 < 2^32 -1. If we tried to go to 1626, the output would overflow. 1626^3 > 2^32 -1. In practice we could usually go a little more because we wouldn't see solid ones unless the input was above the supply.
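The 1625 limit is easy to confirm with integer arithmetic. This Python sketch just searches for the largest RM whose gain (RM)^N still fits in an unsigned register. The function name and defaults are made up for illustration.

```python
def max_rm(stages=3, bits=32):
    """Largest RM with RM**stages <= 2**bits - 1."""
    limit = 2 ** bits - 1
    rm = 1
    while (rm + 1) ** stages <= limit:
        rm += 1
    return rm

sinc3_limit = max_rm(stages=3)
sinc2_limit = max_rm(stages=2)
print(sinc3_limit, sinc2_limit)
```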
As I understand it, X and Y are write-only? What would be the purpose of using them for the accumulators? We only need to read one accumulator, so it would be fine to mux them, in case somebody wanted the first- or second-order value for some reason. I would prefer to lose access to part of the accumulators than lose timing accuracy.
I'm pretty certain Chip will make these as separate smartpin modes. So when set for Sinc2 the second accumulator will become full 32-bit.
It's not reconfigurable logic like an FPGA, but the optimiser will reuse flops in particular.
Comments
What would be a big improvement is more bits of resolution within the same time period. We need to understand that.
If that is not possible, the Tukey is the best approach. I just don't have my head around the more-bits thing yet. Do you understand it?
I guess there is an effective interpolation also happening at the same time in the higher order Sinc filters.
PS: Thank you hugely for the spectral graphs!
There's no fixed set. It can be whatever we want.
That's just what I wanted to get my hands on. Thank you!
The different colors are different ADC channels. Gio and Vio are plotted together because it should only be the DC value that is different.
-James
So, you reset all the accumulators and differentiators, only if it's being reset.
You run some number of ADC bits into it, updating your accumulators.
After so many bits, you get a reading and update your differentiators.
Can that outer loop run for infinite iterations without the need to reset the acc's and diff's?
Yes, the accumulators can run forever; they wrap around. But you don't want them to wrap around more than once between reads, because you won't know how many times it happened.
https://dspguru.com/files/cic.pdf See section 4 about bit growth. R is the decimation factor. So reading the accumulator less frequently will require it to have more bits.
If you resume after an extended break, or after changing channels, or changing sample rates you'll need to cycle the outer loop a few times to flush out the old diff's.
-James
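The run-forever behaviour can be checked with a small software model. This is only a sketch under assumptions: three 32-bit accumulators that wrap, three differences taken every R clocks, a constant input of 1, and R = 1024. The names are made up and it is not meant to describe the actual smart pin logic.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers that wrap around

def sinc3_decimate(bits, decimation):
    """Three wrapping integrators at the bit rate, three combs at the read rate."""
    acc1 = acc2 = acc3 = 0
    diff1 = diff2 = diff3 = 0
    readings = []
    for n, bit in enumerate(bits, start=1):
        acc1 = (acc1 + bit) & MASK
        acc2 = (acc2 + acc1) & MASK
        acc3 = (acc3 + acc2) & MASK
        if n % decimation == 0:
            # Differences are taken modulo 2**32, so earlier wraps cancel out.
            c1 = (acc3 - diff1) & MASK
            diff1 = acc3
            c2 = (c1 - diff2) & MASK
            diff2 = c1
            c3 = (c2 - diff3) & MASK
            diff3 = c2
            readings.append(c3)
    return readings

R = 1024
readings = sinc3_decimate([1] * (R * 20), R)
print(readings[:4])
```

With an all-ones input, the first two readings in this model are start-up transients and every later reading is R^3 = 2^30, even though the third accumulator wraps many times during the run.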
Absolutely. That's what I call rolling or continuous.
Right, the reset isn't even needed. But if it's in software then easy.
The filter size thing messed with my head until I had data out. You'll note I'm unnecessarily resetting the accumulators too.
I can see that sinc3acc3 would get huge fast.
What kind of read interval were you anticipating, Evanh? 256 clocks? 4096 clocks?
At a read interval, you would need to read all three accumulators and maintain the diff's in software, right?
I'm imagining the diffs in software to maintain existing gate count.
Correct for both.
[edited a hundred times ]
Sinc3 produces 30-bit samples in 1024 clocks. So that's its limit.
Sinc2 produces 20-bit samples in 1024 clocks, and 32-bit samples take 64k clocks.
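Those figures follow from the filter gain growing as clocks^N. A tiny Python sketch, assuming a 1-bit input and counting only the gain term; the function name is made up for illustration.

```python
import math

def sample_bits(stages, clocks):
    """Bits of gain for an N-stage sinc filter read every `clocks` clocks."""
    return stages * math.log2(clocks)

sinc3_1024 = sample_bits(3, 1024)
sinc2_1024 = sample_bits(2, 1024)
sinc2_64k = sample_bits(2, 65536)
print(sinc3_1024, sinc2_1024, sinc2_64k)
```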
-Phil
Very interesting, Phil. Thanks!
The filtering performance should be about the same as sinc3 or sinc4 for the same length. But it's a different way to implement it.
-James
Wouldn't it still be helpful to have the smart pin report ACC3 on a routine basis? Then we'll have plenty of time to pick up the sample and process it. Plus, no information will be lost in ACC3.
All 3 ACCs might need to be 32 bits internally. But if we design for a maximum read interval of 1024 clocks, then we would only need to report the specified LSBs from ACC1 and ACC2. If we had the full ACC2 we could run longer between reads, but with less filtering. The extra length might make up for it.
-James
Yes. I'm still not quite sure what all the smart pins can do. Automatically collecting ACC3 would be good.
I guess if we were using the Tukey or trapezoid window, the smart pin would know when the report was going to happen and ramp the window down at the right time.
The smart pin could do something like this:
wait for report timer
ramp window down
read and clear accumulator
send report
ramp window up
repeat
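The steps above can be sketched in Python as a windowed accumulate-and-report loop. This is a guess at the idea, not the smart pin design. It uses a trapezoid window with integer weights, and the period of 256 clocks and ramp of 32 clocks are arbitrary. All names are made up for illustration.

```python
def trapezoid_window(period, ramp):
    """Integer weights: ramp up 1..ramp, hold at ramp, ramp back down to 1."""
    up = list(range(1, ramp + 1))
    flat = [ramp] * (period - 2 * ramp)
    return up + flat + up[::-1]

def windowed_reports(bits, period=256, ramp=32):
    """One report per period, scaled so an all-ones bitstream reads 1.0."""
    window = trapezoid_window(period, ramp)
    full_scale = sum(window)
    reports = []
    for start in range(0, len(bits) - period + 1, period):
        acc = 0  # accumulator is cleared after each report
        for weight, bit in zip(window, bits[start:start + period]):
            acc += weight * bit
        reports.append(acc / full_scale)
    return reports

all_ones = windowed_reports([1] * 1024)
half_scale = windowed_reports([1, 0] * 512)
print(all_ones, half_scale)
```

Because the window is small at both ends of each period, the read-and-clear happens while the weights are lowest, which is the point of ramping down before the report.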
It also turns out that Bout bits are needed for each integrator and comb stage. (https://dspguru.com/files/cic.pdf)
Right, just read the data.
So it looks like 2 values to flush out.
Can you actually make savings though? How is the existing +1 summing done?
Yes, that combats the lag.