Chip,
Here's a synthesised ramp without any intended noise, although the nature of the NCO size may well be inducing beating. These were all run on the same dataset with the same window sizes.
The trapezoid is clearly straighter than the Sinc1. But it should really be compared to the Sinc2 because that's the equivalent complexity. And for that the increased resolution of the Sinc2 really stands out.
It all falls down, though, at the non-rolling Sinc2, which I know we want to do. The non-rolling Sinc2 has the same proportion of ripple as the Sinc1 does!
Tukey-windowed 256-sample sine wave:
You can see the difference at 1x.
Anyway, the acquisition time on these 8-bit-quality samples was 256 clocks. Windowing was performed on the first 29 and last 29 bits in the ADC stream.
This is the raw bitstream data I used for the above graphs, stored as binary 32-bit Intel little-endian. So it's all about face. :)
It's synthesised and not what Chip is working with but is, nevertheless, a valid ramping bitstream.
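For anyone who wants to poke at this offline, here is a minimal Python sketch of the kind of comparison being described (not evanh's actual tool; the file name, the LSB-first bit order and the linear taper are all assumptions):

    import struct

    def load_bits(path):
        # unpack 32-bit little-endian words into a flat list of bits
        bits = []
        with open(path, "rb") as f:
            while True:
                word = f.read(4)
                if len(word) < 4:
                    break
                (v,) = struct.unpack("<I", word)
                bits.extend((v >> i) & 1 for i in range(32))  # LSB-first guess
        return bits

    def sinc1(sample):
        # plain ones-count over the sample window, no windowing
        return sum(sample)

    def trapezoid(sample, taper=29):
        # linearly attenuate the first/last `taper` bits, full weight elsewhere
        n = len(sample)
        w = [min(1.0, (i + 1) / taper, (n - i) / taper) for i in range(n)]
        return sum(wi * b for wi, b in zip(w, sample))

    bits = load_bits("ramp.bin")   # hypothetical file name
    N = 256                        # clocks per sample
    for p in range(0, len(bits) - N + 1, N):
        chunk = bits[p:p + N]
        print(sinc1(chunk), round(trapezoid(chunk), 1))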
What does it mean? I wish I understood. Do you have any indication that we could get more dynamic samples in shorter time, maybe more resolution?
Yes, maximizing the 64's is good, because we get more sampling.
I looked up Logic Friday. Looks neat. I'm pretty sure the ASIC compiler does all that stuff automatically, in order to minimize logic.
Logic Friday uses espresso to minimize logic and I have some PLD software that does the same, which I'll run as a check, since I don't trust Logic Friday entirely from past experience.
The window we have now is Tukey32, with cos 0 to cos ±180 divided into 32 equal steps. The first two and the last two values are very close to 1 or 0. I could look at a Tukey36 later and discard four outputs. Chi-square tests to compare the errors between actual and integer values for the two windows would tell us which is best.
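As a sketch of what that table generation might look like (my reconstruction, not TonyB_'s actual table; the exact phase steps and rounding are guesses):

    import math

    steps = 32
    inc = [round(64 * (1 - math.cos(math.pi * k / (steps - 1))) / 2)
           for k in range(steps)]
    print(inc)       # raised-cosine ramp; first two values 0, last two 64
    print(sum(inc))  # total taper weight, for chi-square bookkeeping

    # A Tukey36 variant would use steps = 36 and discard the first/last two
    # outputs; comparing the round-off errors (actual minus integer) of the
    # two tables with a chi-square test would then rank them, as noted above.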
I just wanted to note that at first I was confused about the filtering, thinking you were talking about filtering the ADC values. Once I understood that you were talking about weighting the incoming bitstream, windowing the first and last samples, it made a lot more sense to me. Hopefully no one else is mentally stuck like I was. [8^)
That was all done solely on 256 clocks per sample. From a speed perspective, all the graph lines were on equal footing. So, yes, totally more resolution.
EDIT: For a Sinc2 with rolling window, that is.
I love the notion, but how in the heck does it work? In 256 bit samples you can have 0..256 ones. How do you get more resolution from those bits? They can be mixed around in different patterns, but there's still only 256 of them.
It only works as a rolling operation. So yeah, something like that I guess. It'll be something about the difference between square and triangle, I presume.
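A rough illustration of that square-versus-triangle difference (my own numbers, not from the thread's datasets):

    N = 128                          # triangle of length 2N-1 fits in 256 clocks
    rect = [1] * 256                 # Sinc1 kernel: every bit weighs 1
    tri = list(range(1, N + 1)) + list(range(N - 1, 0, -1))   # 1..128..1

    print(sum(rect))   # 256   -> 257 possible sums, ~8 bits
    print(sum(tri))    # 16384 -> 16385 possible sums, ~14 bits in principle

    # Only a rolling Sinc2 sees every bit with the full triangular weighting;
    # restarting the window for each sample throws the extra resolution away.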
Jonathan
Thanks for pointing this out, because I don't want people to be confused.
This only involves the first 32 and the last 32 bits in the ADC bitstream, for ANY-sized sample.
All these tables look expensive. Can you try two accumulators clocked at the bit rate and two differentiators clocked at the sample rate and compare the performance and cost of that to that of this windowing?
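For reference, a behavioural sketch of that arrangement, which is a Sinc2 CIC decimator: two accumulators at the bit rate, two differentiators at the sample rate (a software model only; the register width and decimation ratio are placeholders):

    MASK = (1 << 32) - 1               # model 32-bit hardware registers

    def sinc2(bits, R):
        # two integrators at the bit rate, two differentiators at the sample
        # rate; wrap-around arithmetic makes accumulator overflow harmless
        acc1 = acc2 = 0
        d1 = d2 = 0
        out = []
        for n, b in enumerate(bits):
            acc1 = (acc1 + b) & MASK            # integrator 1, bit rate
            acc2 = (acc2 + acc1) & MASK         # integrator 2, bit rate
            if n % R == R - 1:                  # once per output sample
                t = (acc2 - d1) & MASK          # differentiator 1
                d1 = acc2
                out.append((t - d2) & MASK)     # differentiator 2
                d2 = t
        return out

    # DC gain is R*R, so R = 256 clocks per sample yields ~16-bit words
    print(sinc2([1, 1, 0, 1] * 256, 256))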
We need to understand the SINC filter stuff better, but this windowing is working much better than raw sampling.
Those tables that TonyB_ made are actually pretty efficient. I think the pure logic version is no bigger than the counter version would be.
That logic is really not much. There are a lot of common sub-terms that would make it even smaller. I say we just go with the lookup table. It's much more certain than something we'll never be able to see working because it's buried.
Yes, give Verilog a table with decimal values to minimize. Chip, what do you think of the add with carry option? You'll need to subtract 1 from the actual increments.
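A quick check of the add-with-carry idea as described (the 1..64 values below are placeholders, not the real Tukey increments):

    table_7bit = list(range(1, 65))           # placeholder 1..64 increments
    table_6bit = [v - 1 for v in table_7bit]  # 0..63, fits in six bits

    acc_a = acc_b = 0
    for inc7, inc6 in zip(table_7bit, table_6bit):
        acc_a += inc7
        acc_b += inc6 + 1                     # '+ 1' is the forced carry-in
    assert acc_a == acc_b                     # identical accumulations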
I'm thinking that it might be good to investigate a 16-sample window with less resolution. While this windowing is critical, we are only talking about a handful of bits in a much larger sample. I think we could give them less resolution on X and Y and it would probably be just fine. Like, instead of a 5x6, maybe a 4x4 would be sufficient.
...I see it's hard to get much cosine definition with only 16 levels.
More pictures. These are details of 4096-sample conversions (12-bit) with the ADC receiving a finely-stepped ramp.
First, the straight accumulator approach (no windowing). The monitor DAC was wrapping vertically because of the noise amplitude:
It looks like the windowing is getting rid of sporadic +1/+2 contributions from the initial and terminal bit samples.
So, could you tell me in a way I can grasp what I see: is the analog input voltage a staircase function?
You read out the counter at a given period?
You subtract the last read value from the current value?
If I didn't know better, I would think you look to the future as you start and to the past as you stop reading ;-)
OK, it's 4096 clocks per sample. At a clock rate of 100 MHz that's about 25 kHz sampling frequency.
Is this still a slowly ramping analog input voltage? Is the spike always 16 bits (as if bit #5 were toggling)?
What happens if you measure a slightly longer period?
Are the spikes independent of the signal level you start with? What happens if you execute a series of measurements? Is the effect repeated, and if you pause between bursts, is there a minimum pause time for the effect to occur?
And just another weird idea: couldn't it be that the spikes are an overflow of the scope or of the ADC that outputs the signal?
If the 32 increments are symmetrical about the midpoint, we might be able to use 4x7 or 4x6 with perhaps a carry input for half the levels. In other words, we have only 16 levels in the logic but we end up with 32 different increments. I think reducing the levels to 16 would affect the quality, but I'll have a look at the increment values later.
If the ramps were symmetrical up and down so that increment(x) ascending + increment(x) descending = 64, then the windows could be overlapped by 32 bits and every bit in the stream would have equal weight. Doing this in hardware might be impossible as the 32 bits to use again would have passed through already.
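A numeric check of that overlap property, assuming complementary ramps (the ramp values here are hypothetical):

    up   = [2 * k + 1 for k in range(32)]    # hypothetical ascending ramp, 1..63
    down = [64 - u for u in up]              # complementary descending ramp

    # every bit in the 32-bit overlap gets combined weight up[k] + down[k] = 64,
    # the same as the unattenuated body of the window
    assert all(u + d == 64 for u, d in zip(up, down))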
To all those who are still troubled about what we are doing at the moment:
There must be a windowing function, either in hardware or software, to attenuate the samples at the ends of the sampling window. The mathematical assumption is that the actual waveform consists of what is in the sampling window, repeated from -∞ to +∞ in the time domain. The windows must start and end with the same value, usually zero, to avoid discontinuities. A windowing function can reduce noise in a steady or slowly-varying signal as Chip's photos prove but it is not the same thing as a filter.
Erna, there is a problem with abruptly starting and stopping the sampling of an ADC bitstream.
Within the bitstream there are always 2..7-bit-long cyclings going on. They are like fibers in a tree branch. They need to be gently attenuated at both the start and at the end of the measurement, in order to avoid their prickliness, which will add noise counts to your final reading.
Here is what the bitstream looks like when it's abruptly broken into:
You can imagine that to deal with that break, you'd need to clean up the ends. That's what the windowing does. It tapers the ends down and gets rid of the sporadic splinters that would have eroded the quality of your measurement.
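A small simulation of those splinters (a software model, not Chip's silicon; the modulator, input level and taper shape are all assumptions):

    def sdm(level, nbits):
        # first-order sigma-delta modulator model; density of 1s equals level
        acc, out = 0.0, []
        for _ in range(nbits):
            acc += level
            bit = 1 if acc >= 1.0 else 0
            acc -= bit
            out.append(bit)
        return out

    bits = sdm(0.3713, 1 << 16)                      # arbitrary constant input
    w = [min(1.0, (i + 1) / 32, (256 - i) / 32) for i in range(256)]

    raw = sorted({sum(bits[p:p + 256]) for p in range(0, 60000, 997)})
    win = sorted({round(sum(wi * b for wi, b in zip(w, bits[p:p + 256])))
                  for p in range(0, 60000, 997)})
    print(raw)   # abrupt cut: counts wander by a couple of LSBs with phase
    print(win)   # tapered ends: counts typically bunch much more tightly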
I would just cut away the head and tail, because it doesn't matter if you don't measure the signal 100% of the time; you are cutting it into chunks anyway. This ADC is not sampling but integrating. If the signal changes within the measuring time, you only get an average value and know little about the actual slope of the signal.
Any type of filter influences the signal, and the filter for the problem you have is quite simple: the slope of the streamed data is limited to a plausible maximum. Whenever the limit is exceeded, replace the signal with the last valid value.
If cutting away is not a choice, why not have the streaming constantly switched on and sample the streamed data? Like: set up a streaming counter, add all the streamed values over a given time and just spill out the accumulator.
In my motor control application I normally do not sample the counter equidistantly in time, but triggered by a certain event. The reason: if a motor rotates, the integral of the voltage at constant current is independent of rotational speed, so I measure once per rotation, not at a constant time interval.
Yes, constant sampling would be an alternative. I'm hoping EvanH comes up with some ideas that we can employ.
That bit about the pure logic version being no bigger than the counter version, I am not quite following?
You still need a counter at the ends, as the total sample time is fully variable, but the window ends are fixed
I would also be wary of pure logic, as that can add delays which the larger 32-bit adder might not tolerate, i.e. those big adders are likely already pushing timing. Better to have the window values emitted from a register, so it behaves like a registered ROM/state engine in the timing.
Also, the NCO smart pin mode already has a 32b adder - is this new mode coded to carefully reuse that existing adder ?
Certainly, a 16-sample window with less resolution is worth investigating. Because this duplicates 64 times, the size needs to be pruned to the absolute minimum.
Chip, it sounds like if you let the ADCs free-run, they will become slower and slower to respond to a voltage change; the older samples weigh more than the newer samples, correct? Or do all the samples weigh evenly? I assume, because it's an accumulator and bits march left, the older data will have more effect than newer data, assuming you have a steady signal during the sample window.
If the sample window is extended and the data is changing radically, the newer samples have the ability to out-vote the older samples, or more likely, the sample ends up becoming a rough average of the quickly changing values.
If you left the counters in free-run mode, for slowly changing data the older samples would weight the data, but you'd also end up overflowing the accumulators, so you MUST reset them periodically.
I assume that resetting the accumulator subjects it to the "prickles" of the head and tail ends of the bitstream?
Ultimately, you're accumulating samples, and the samples that are the oldest are the heaviest in weight, because they are furthest left in the accumulator. The incomplete first sample throws off the initial accumulation? How is the final sample perturbing the accumulation as well?
So I take it the previously observed low-frequency noise problem is actually an A/D architecture problem, and there is nothing wrong with the analog circuitry?
Wouldn't any off-the-shelf sigma-delta chip that outputs discrete samples also have the same digital averaging hardware as is being proposed?
Not quite; there are many sources of noise, error and spread, and this is merely one of many.
The internal noise sources are still there, as are process and temperature; this filter allows lower sample rates, with lower LSB jitter.
EDIT- (for a benchmark of 12-bit ADC operation)
The jury is still out on the benefit vs logic cost trade off.
As for off-the-shelf sigma-delta chips: they have various filters, but it is common to see that the more filtering, the longer the settling time. IIRC some SDMs need three readings before the 'old' data is fully replaced with 'new' data.
So they are less suited to multiplexed use, and multiplexing is one proposal Chip has to better compensate the ADC for process and temperature.
Tukey32
Tukey window with 32 steps from cos 0 to cos ±180
5-bit to 7-bit, 1..64, product terms for X[6:0] = 2,4,6,6,7,6,6, or
5-bit to 6-bit, 0..63, product terms for X[6:0] = 0,2,6,8,9,7,6
Logic Friday has a strange syntax for its equations and I have not verified them yet with my other program.
EDIT: Truth tables:
Comments
Removing the zeroes and having four 64s saves a product term, so this is the way to go. Also it means four more samples that are not attenuated.
If we used an add with carry, perhaps we could have a 6-bit stateless output 0..63 instead of 1..64?
If he sees an improvement, and it's a selectable filter, then I see nothing wrong with what he's proposing.
I wish people would stop giving him a hard time and let him get on with it, so he can be finished by the deadline.