I would not call this a filter. It is in fact a window function with a trapezoid window, like those used with an FFT. This reduces the leakage effect and therefore gives more accurate results.
The integrator is still linear and not exponential like in classical filters.
A triangular window looks to be straightforward to implement. The code uses 512 longs of data and if some datasets were posted we could try different windows ourselves.
The first and last samples are like the rest of the data, but their position is significant, as there is continuity in the bitstream. For some reason, fading them in at first and out at last cuts the signal noise down so that the ADC operation becomes quite ideal.
This windowing operation is identical to oversampling with (almost entirely) the same data, but at 32 different sequential bit offsets. It's the same thing, just done with efficiency by exploiting commonalities.
I've done a lot of testing on this tonight, even making a closed loop to adjust the sample length, in order to drive the 1's delta between VIO and GIO measurements to constants like 255, in order to make mathless conversions. Didn't work well, due to frequent sample-length adjustments.
I think all we need is a smart pin mode to make windowed measurements for ADC sampling. That is not much logic. We can't automate ADC operation beyond the windowed measurement, though, because it gets too complex, too quickly. This means that the streamer won't have the planned ADC modes, because it would need to make windowed measurements for them to be of acceptable quality and that would over-complicate the streamer.
I'm really glad we've got the ADC working as it should, now. That trumps a lot of the other stuff. ErNa?
Ramps will be fixed at 32 clocks each, plateau will be (X - 64 clocks).
I think so. We could go more or less than 32, but 32 seems like a happy medium that doesn't impinge too badly on smaller sample sizes, like 8-bit quality.
I think the smart pin will work like this:
(1) Input first 32 samples, accumulate {1..32} * highs.
(2) Accumulate 32's for highs during duration of WXPIN period.
(3) Input last 32 samples, accumulate {32..1} * highs.
(4) Report accumulator >> 5 to result.
There will be another chance to salvage things that once went wrong because we did not see the full picture. At least we know there is no simple solution unless it comes from a strong genius ;-)
I guess you can think of this as the average of 32 regular moving average filters, all centered on each other and with widths that increase by 2 samples...
Yes, they are centered on each other, but the top one has 32 bits in front of it and 1 bit behind it. The next one below has 31 bits in front of it and 2 bits behind it, and so on, down to the bottom one which has 1 bit in front of it and 32 bits behind it. Now, add up all the bits and divide by 32 to get your sample.
Well, I'm getting numbers. Not at all sure how to interpret them yet. The obvious observation is Sinc2 resolution is basically double the bit-depth of the others.
Chip,
That drawing is third-order, not second-order. I just noticed you called it second-order. So, to get second-order, only do two of the three shown stages. And even then only do two accumulators. Forget the decimator stages since software can do those easily.
That way you aren't making it any bigger than existing resources in the smartpins.
EDIT: Here's all that's needed
With a second-order filter, wouldn't we need a 2nd-order integrator? I still have no idea how these things work.
Consider the raw bitstream from the ADC to be a one bit stream of samples containing all jitter and no signal. You want to take the jitter pulses and spread them out into more smaller signal pulses. So you take an accumulator wider than one bit and accumulate these jitter bits into it. When the accumulator is sampled periodically and subtracted from its prior reading, this increment can be considered to have two components: a desirable signal component and an undesirable jitter component. This is what you do now with just a simple accumulator.
Originally, the 1-bit bitstream was all jitter and no signal. The first accumulator converts much of this jitter into signal. When you add another accumulator, this second accumulator converts some of the remaining jitter into more signal.
If you added this second accumulator that accumulated the results of the first accumulator, and then added another differentiator to compensate for the extra accumulator, it would leave the first-order signal's signal component pretty much unaffected, but would convert some of its jitter into more signal by the same process that the first one did so.
Run the following C code that compares first order to second order for 8-bit samples. It prints out two columns. The first shows 8-bit first-order samples with 1 LSB of jitter. The second shows 16-bit second-order values with more than one bit of jitter, but the jitter is confined to the bottom 8 bits, which carry no meaning: we're only taking 256 samples per reading, so only the top 8 bits are significant. If you want 32-bit accumulators, change all the uint16_t's to uint32_t's and change the %04x's to %08x's.
#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv) {
    // bitstream generation
    uint16_t vacc_prev = 0;
    uint16_t vin = 0x3678;
    // accumulators
    uint16_t acc1 = 0;
    uint16_t acc2 = 0;
    // sampling period counter
    int samples = 0;
    // previous values
    uint16_t acc1_prev = 0;
    uint16_t acc2_prev = 0;
    uint16_t diff21_prev = 0;

    for (int i = 0; i < 16384; i++) {
        uint16_t vacc = vacc_prev + vin;          // advance bitstream accumulator
        uint16_t bit = vacc < vacc_prev ? 1 : 0;  // detect carry (wraps at 16 bits)
        acc1 += bit;                              // update first accumulator
        acc2 += acc1;                             // update second accumulator
        if (++samples >= 256) {
            // first-order differentiation of first accumulator
            uint16_t diff1 = acc1 - acc1_prev;
            // second-order differentiation of second accumulator
            uint16_t diff21 = acc2 - acc2_prev;
            uint16_t diff22 = diff21 - diff21_prev;
            // first order, second order values
            printf("%04x, %04x\n", diff1, diff22);
            samples = 0;  // reset sample count
            // save differentiation variables for next time
            acc1_prev = acc1;
            acc2_prev = acc2;
            diff21_prev = diff21;
        }
        // save carry accumulator for next time
        vacc_prev = vacc;
    }
    return 0;
}
Here are five runs with differing parameters. Each run takes 20 readings (samples) and prints them consecutively.
First parameter is algorithm select. Second parameter is number of bits to read from the bitstream (chunk length). Third parameter is reset flag to zero the bitstream index and accumulators.
Note the two rolling runs have a little noise. This will be because the bitstream was filled with a 12-bit NCO, while the sampling is only 256 bits long. DC level is 0x523.
EDIT: Corrected first reading of rolling runs to start as a reset.
I think all we need is a smart pin mode to make windowed measurements for ADC sampling. That is not much logic. We can't automate ADC operation beyond the windowed measurement, though, because it gets too complex, too quickly. This means that the streamer won't have the planned ADC modes, because it would need to make windowed measurements for them to be of acceptable quality and that would over-complicate the streamer.
I thought this was going to be optional?
Forcing any 'filter' and removing an ADC streamer ability does not seem a step in the 'flexible P2' direction at all.
Before encoding a single filter into silicon, a lot more testing and analysis is needed.
E.g., how does this vary with SysCLK? 250 MHz seems quite a high test value, which will have millivolts of charge cap variation and thus be prone to supply ripple/ringing etc.
I am lost when it comes to understanding analog. I can lay out a pcb to be quiet for analog, but that's it.
Something seems amiss when you can effectively remove a group of first samples and last samples and end up with a superior result. Since you say the sampling is free running and has been before you start, something is upsetting the results. Otherwise all the results would be noisy and you could just take any window of samples and you would get the same/similar results.
So the question is rather, what is causing those first and last sample groups to be poor?
What you have found is the result of a problem. Now it's time to find out the why. It's not the final silicon yet, so a workaround currently isn't the solution.
Just my observation.
I agree.
Harder to figure out is that, even if there are clumping effects in the first/last windows of 32, the mirrored nature of the very small first/last windows will have varying effects, depending on the phase/frequency of that clumping.
I.e., the notch effect can shift with sample N, SysCLK, and imposed error signal.
I've added a Sinc3 now too, but even 4 kbits of bitstream is enough to break the 32-bit counters. EDIT: Sinc3 presumably triples the bit depth. So, 12 * 3 = 36-bit counters required for 4 kbit sampling period.
Can't you just truncate the bottom four (or more) bits on the third accumulator to make it fit in 32 bits? None of the bits below the top 12 or so really mean anything anyway and are just there to minimize quantization error.
Probably, it would have to be done within the accumulator circuit though, and then it's not as per that drawing.
And it doesn't matter how long the sample run is, although past 6k bits, the 1/f noise becomes influential.
Noise is always there, spread over all the samples.
Jmg, rather than claim a 2-bit improvement, I should just say that the windowing is restoring two bits that should have been there already, but were being destroyed by cycling patterns.
Jmg, I've tested this across frequency, even with the 24MHz RCFAST clock, per Yanomani's request. It just works!
I think the best contemplative understanding is lent from the staggered-oversampling concept.
Notably, this algorithm totally fails if the order of the first and last 32 bits is tampered with. Those bits must be as they come from the ADC, in order.
There is a rational explanation for why this works, but none of us can exactly identify it. I'm pretty sure that it has to do with the tapering periods covering, at least, the <7-bit spans of rise-to-rise patterns in the ADC bitstream data.
Evanh, your testing looks interesting. I get the concept of one accumulator following another, but cannot figure out what that would do. Do you see any indication that we could get better than an 8-bit result out of 256 samples? I don't suppose that's possible, but maybe by some subtlety in the bitstream, more resolution could be sussed out. When I get back in the office and I have bigger monitors, I will look through your posts' data more carefully.
Comments
It is a window function and seems to be closest to the Planck-taper window with a sharp rise and fall:
https://en.m.wikipedia.org/wiki/Window_function
Yes, what we have now seems very stable.
Console output, in respective order:
And again but with chunk length set to 0x10000 (64 kbit):
We're out of resolution. Can't go any further without scaling first.
And this is the various filters, for those that can be bothered scrolling:
Full source code attached:
Do you have any numbers for predicted improvement in ENOB for each of the filters yet? (e.g. vs start/end window size and N samples)
Is the noise here white noise?
No idea. That's the mathy part.
We should test with a variety of input signals. I will put that on my list for when I get a board.
Thanks Chip, for taking the burden of doing these extra tests.
After almost six days being forced to be in watch-only mode, thru a really small Android screen, now I'm back home, and a 15" seems a whole lot of a landscape to my eyes.
Now, I'm searching for some raw sample readings, because I want to peruse the bitstreams a little, with my own refreshed eyes.
As long as there is a note about the sysclk value used in getting the sample bitstreams and some hint about the analog voltage or waveform being sampled, I believe they'll be enough for my needs.
Henrique