I've got the Sinc2 ADC mode working in the smart pins. It's settable for 7..14-bit conversions.
Look at the lower 8 bits of this 14-bit conversion. Almost no noise. It would take 63 more scope screens stacked vertically to see the whole sine wave:
Is that with real P2 ADC data feeding it? What is the HW test setup?
Can you plot Sinc3, also fed from a real ADC, as a comparison?
What bit counts does this cover? I presume this can also be fed from an external pin?
Could someone set this up to stream at some modest number of bits, and then apply another filter in software for 16/18/20 bits, depending on where the system noise floor is?
Yes. That was the P2 pin test board on the FPGA. I was feeding it from a quiet 3.3 volt regulator on the new P2 eval board. It's real.
I was looking at the sinc3 filter today, after I got the sinc2 working. It is amazing what the sinc3 can do with dynamic signals, but it does not have enough base information to know DC very accurately. These new sinc2 modes are like instrumentation modes for the ADC, whereas the sinc3 is like a dynamic mode for changing signals. Because of rapid bit growth, the sinc3 can only do 1K clocks before it fills up. The sinc2 is going to get a slight makeover and go to 16 bits, taking 32K clocks to achieve a conversion. I am anxious to see what a 16-bit sinc2 conversion looks like.
Have you considered doing a 1024-point sinc2, then taking its output and running it through one of Saucy's cascaded moving average filters with a [1,6,15,20,15,6,1] kernel, or equivalent? That is, if you have a working version of cma that runs in hardware. [1,6,15,20,15,6,1] sums to 64, and with 10-bit input, where every bit is counted, you should get full coverage as far as "missed codes" are concerned. The nice part is that you don't have to wait 32K clocks to read the latest long count. It's just like the stock market, where you have a new 100-day moving average at the close of every trading day.
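As a rough software-side sketch of that idea (not something the smart pin does), the Python below applies a sliding weighted sum to a stream of 10-bit readings and produces a fresh output for every new reading once the window is full. It assumes the 7-tap binomial row [1,6,15,20,15,6,1], which is the kernel that sums to 64; the function name and the sample data are made up for illustration.

```python
from collections import deque

KERNEL = [1, 6, 15, 20, 15, 6, 1]  # binomial row, sums to 64 (assumed kernel)

def sliding_weighted_sum(samples, kernel=KERNEL):
    """Yield one weighted sum per input sample once the window is full."""
    window = deque(maxlen=len(kernel))
    for s in samples:
        window.append(s)
        if len(window) == len(kernel):
            yield sum(w * x for w, x in zip(kernel, window))

readings = [512] * 10  # hypothetical steady 10-bit sinc2 readings
outputs = list(sliding_weighted_sum(readings))
print(outputs)  # 10 readings through a 7-tap window give 4 outputs
```

With a 10-bit input the largest possible sum is 1023 * 64 = 65472, so the result still fits in 16 bits.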
Lazarus, that would be great, but it would need to be done in software, as there are not that many flops or special memories to put to use for such a purpose. What's going on lives inside the smart pin, which only has a hundred and twelve flops. Connections to those flops are determined by the pin mode selected.
Another path to 16b, if that is too costly fully in HW, would be to manage the 14b in HW and add a SW filter?
I presume the sinc2 filter can work with external CLK.DAT signals, as sinc3 does?
There will be solutions that are combinations of the HW filters, and SW post-processing ?
The easiest to use modes will be the ones supported by hardware, but higher performance / combination cases can be mixed, and the HW filters can lower the data rates down to where SW can manage.
I am going to combine the sinc2 and sinc3 smart pin modes into one configurable mode. It will support external clocking.
Here is an issue, though... In order to fit everything, I can't have a random reload counter. I need to fix conversions to 2^n clocks. Is that a problem?
I suppose the question I should be asking, then, is what to expect in the hardware, and at what rate. Are we going to get sinc2 summed to 6 or 7 bits (including the overflow) at sysclock/4, sysclock/8 or sysclock/16, with or without overlap, and with or without sinc3 of some flavor? That should work quite nicely and would make me very happy, since I know that FFT is already possible for audio processing even without the CORDIC or Goertzel modes, and a faster processor with pipelined CORDIC alone should make orders-of-magnitude better FFT performance possible.
I just don't want to be stuck with a hardwired A/D that only gives me a sample once every 32768 clocks, where I can't get under the hood, so to speak, to run my own algorithms, which I know will run even on a P1. There it is just the various bottlenecks that get in the way, like not having USB 2.0 at even 12 Mbit/s, which would allow me to do things like ADAT transcoding, even if that is not the Propeller 1's fault but just the way the FTDI chip works. And more pins!!! I am sure you know the rundown.
Otherwise, I think that some of the cma(x,y,z) stuff would make a great OBEX software cog. It is probably premature to ask for a Spin instruction that just takes the parameters and runs with them in optimized PASM, just as it would be premature to ask for FFT(x) as something easy to use and precompiled in the Spin core, since some people might want or need a polyphase filter instead, or a chirp-z transform, and there are a bazillion FFTs out there free for the downloading.
Lazarus, I will make the sinc modes open, so that in some sub mode you can just grab the summations. That would allow people to do whatever else they want. Good thinking.
I have found that, in practice, it is important to take a big bite on these sinc summations. Small ones have almost no filtering effect.
Hmm, that's unfortunate, as the SysCLK may be constrained by more important things, and it is nice to be able to reject certain frequencies like 50Hz or 60Hz mains, or other intruding frequencies.
Maybe it can be made 'near enough' by using the nearest count on the SW side?
E.g. taking your 14-bit example, which I think runs at a 102.4us sample period: 976.5625 of those make 10 sps, so 977 is the nearest whole-number average for a notch.
Or 163 or 195 as whole numbers, for someone focusing on higher sps and 60Hz or 50Hz rejection.
That looks to keep the error from not spanning exact whole cycles more than 55dB down, so maybe that's good enough?
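The arithmetic behind those counts can be checked with a short Python sketch. It assumes the 102.4us sample period quoted above and models the software stage as a plain boxcar average of N samples. The function name is hypothetical, and the rejection figure is the textbook response of that average, not a measurement.

```python
import math

def nearest_notch_count(sample_period_s, reject_hz):
    """Return (ideal count, nearest whole count, boxcar gain in dB at reject_hz)."""
    ideal = 1.0 / (reject_hz * sample_period_s)
    count = round(ideal)
    # Magnitude response of an N-sample average at frequency f:
    # |sin(pi*f*N*T) / (N*sin(pi*f*T))|
    x = math.pi * reject_hz * sample_period_s
    gain = abs(math.sin(count * x) / (count * math.sin(x)))
    return ideal, count, 20 * math.log10(gain)

for hz in (10, 60, 50):
    ideal, count, db = nearest_notch_count(102.4e-6, hz)
    print(f"{hz} Hz: ideal {ideal:.4f}, use {count}, gain {db:.1f} dB")
```

Under these assumptions the three cases land on 977, 163 and 195 samples, each with the target frequency attenuated by more than 55 dB.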
Of course, but at some point you need to make the cut, and some things are done in HW, and some are done in SW. It is important to keep a careful eye on HW costs, to avoid blow-outs.
I think it's now worth testing how these sinc2 & sinc3 & scope filters work with some software situations.
Can you do a table summarizing the ADC filters, bit ranges and sample rates of the filters that you have packed into the smart pin, and their tested status?
I've got the 16-bit-sample Sinc2 working, but it's really not worth it. When measurements become that long (32K clocks), slow noise starts having an influence. 15-bit sampling has about half the problem, while 14-bit sampling is the last realistic stop before things get sloppy.
I will make the Sinc2 handle up to 14-bit conversions. I'd like to limit the Sinc3 mode to 512 counts, instead of 1024. This means we could handle 18-bit external ADC modulators, but not 20-bit, anymore.
I ran the Sinc3 mode a lot and looked at the output. It's good for 14 bits (in as little as 128 clocks!), but after that, we get the same problem as Sinc2. This doesn't have anything to do with the digital filters, but just the limitation of our ADC. It's good for 14 bits, and beyond that, it's a mess.
I will merge the Sinc2 and Sinc3 modes. They will each have an integrator dump and clear mode, and Sinc2 will be able to do fully automatic A/D conversions. If I could limit Sinc3 to 512 counts, max, it would all fit very nicely. An alternative would be to add a few flops to the smart pin scheme, maybe 5 (x64 pins = 320 flops). That would allow 20-bit external ADC's, as well as 16-bit Sinc2, in case we ever get the ADC modulator working better.
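To put rough numbers on how fast the integrators grow, here is a small Python sketch using the textbook rule that an order-k sinc filter summing N one-bit samples peaks at N^k. The function name is made up, and how these figures map onto the smart pin's actual registers is not stated in the thread, so treat this as an estimate only.

```python
import math

def growth_bits(order, clocks):
    """Bits of growth (log2 of the peak value) for `order` cascaded sums over `clocks` 1-bit inputs."""
    # An all-ones bitstream peaks at clocks**order.
    return math.ceil(order * math.log2(clocks))

for name, order, clocks in [
    ("sinc2, 8K clocks", 2, 8192),
    ("sinc2, 32K clocks", 2, 32768),
    ("sinc3, 512 clocks", 3, 512),
    ("sinc3, 1K clocks", 3, 1024),
]:
    print(f"{name}: {growth_bits(order, clocks)} bits")
```

By this estimate a 32K-clock sinc2 and a 1K-clock sinc3 need the same amount of headroom, and dropping sinc3 to 512 clocks saves three bits.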
I realized my Sinc2 SmallBASIC program wasn't handling intb clearing properly. It was zero'ing intb at decimation time, instead of putting inta into it (as opposed to the normal intb = intb + inta). There were also some other order-of-operation issues that I fixed, so that it now relates exactly to how the hardware works. I also found that by initializing acc to 0.25, you get perfect full-run results. And I had to undo Jmg's time-saving 'range/2' change to make it work right, due to some order-of-operation matters that I fixed:
enob = 5 'effective number of bits 5..16
maxenob = 16
bitmask = pow(2,maxenob*2-1)-1
decimate = pow(2,enob-1)
range = pow(2,enob)
acc = rnd 'expose flaws
acc = 0.25 'make perfect
for g = 0 to range
  for iter = 0 to range*2
    acc = acc + g/range 'determine new ADC bit
    if acc >= 1 then
      acc = acc - 1
      t = pow(2,16-enob) 'variably shift up ADC bit before integration to avoid shifting final sample
    else
      t = 0
    endif
    if iter/decimate = int(iter/decimate) then
      result = ((intb - diff) band bitmask) 'compute result
      sample = int((result+pow(2,13))/pow(2,14)) 'round and shift down to make sample
      if iter >= range then circle 10+(g mod 256)*4 + int(g/256)*12, 10+((range-sample) mod 256)*4, 2 filled
      diff = intb
      intb = inta
    else
      intb = (intb + inta) band bitmask
    endif
    inta = (inta + t) band bitmask
  next iter
  print sample,
next g
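For anyone who wants to poke at the same update order outside SmallBASIC, here is a minimal Python sketch of the integrate, reload and difference sequence described above. It leaves out the plotting, the pre-shift of the ADC bit and the final rounding, and the function names are mine rather than anything from the hardware.

```python
def modulator_bits(level, count, acc=0.25):
    """Model of a first-order modulator: emit 1 each time the accumulator overflows."""
    bits = []
    for _ in range(count):
        acc += level
        if acc >= 1.0:
            acc -= 1.0
            bits.append(1)
        else:
            bits.append(0)
    return bits

def sinc2_dump(bits, decimate, width=31):
    """Double integration with intb reloaded from inta at each decimation point."""
    mask = (1 << width) - 1
    inta = intb = diff = 0
    results = []
    for i, bit in enumerate(bits):
        if i % decimate == 0:
            results.append((intb - diff) & mask)  # difference of two window sums
            diff = intb
            intb = inta                           # reload from inta, not zero
        else:
            intb = (intb + inta) & mask
        inta = (inta + bit) & mask
    return results

decimate = 64
results = sinc2_dump(modulator_bits(0.25, decimate * 6), decimate)
settled = results[2:]  # the first two results are start-up values
print(settled)
```

With a steady input of 0.25 the settled results come out at decimate * decimate * 0.25 = 1024, which is the raw value before any shift down to the final sample width.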
The search for >14-bit is probably futile. The TI AMC1035 has differential inputs, chopped, with a second-order modulator, and in Sinc3 mode at a decimation rate of 512 the ENOB is ~13.75.
Merging Sinc2 and Sinc3 makes complete sense, as the third integrator adder for the latter can be repurposed as the differentiator for the former (with room to spare). One of us should have thought of that earlier.
Is the load counter necessary? Couldn't the output be read at any time with config bits specifying the scaling/bit shifting? It would be down to the user to select the correct scaling but it would permit rounding.
I think I've seen a Sinc4 mode that didn't get past 14.
The load counter can publish and zero the final integrator on every Nth clock, giving you a whole period in which to pick up the result before it's overwritten again. Jmg thought it would not be a good idea to fix the sample-count possibilities to the obvious 2^n points, as this would preclude custom clock counts, which I understand. We will be able to read the final integrator at any time if we set the time frame to 1 count, which will inhibit the zero'ing operation, so you can keep track, yourself.
I think it would help with testing if a couple of real bitstreams were posted as binary files.
Will scope mode still have four different filters?
Oh, I want to also have a mode that just shifts in 32 consecutive ADC bits and publishes them every 32 clocks. That way, it will be really simple to get a bitstream recorded for custom processing. This could record 8 pins in a way much more useful than the streamer could:
        wrfast  #0,adc_data     'ready to write adc data
        dirh    #7<<6 + pinA    'enable all smart pins at the same time
        waitx   #32-2-2         'wait for data, accommodate WAITX and REP
        rep     #16,sets_of_32  'record 8 pins in parallel time
        rdpin   x,#pinA         '2 clocks
        wflong  x               '2
        rdpin   x,#pinB         '2
        wflong  x               '2
        rdpin   x,#pinC         '2
        wflong  x               '2
        rdpin   x,#pinD         '2
        wflong  x               '2
        rdpin   x,#pinE         '2
        wflong  x               '2
        rdpin   x,#pinF         '2
        wflong  x               '2
        rdpin   x,#pinG         '2
        wflong  x               '2
        rdpin   x,#pinH         '2
        wflong  x               '2
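Once such a capture has been saved, splitting it back into per-pin bitstreams on a PC is straightforward. The Python sketch below assumes the longs are stored in the interleaved order the loop writes them (pin A, B, ... H, then the next set). Whether the oldest ADC bit ends up in the LSB or the MSB is not specified here, so that is left as a parameter; the function name and test words are hypothetical.

```python
def unpack_capture(longs, pins=8, lsb_first=True):
    """Split interleaved 32-bit words (pin A, B, ... per set) into per-pin bit lists."""
    streams = [[] for _ in range(pins)]
    for index, word in enumerate(longs):
        # Bit order within a word is an assumption, selectable by the caller.
        order = range(32) if lsb_first else range(31, -1, -1)
        streams[index % pins].extend((word >> b) & 1 for b in order)
    return streams

# Two made-up sets of 8 longs: pin A mostly low, pin B stuck high, the rest low.
words = [0x0000000F, 0xFFFFFFFF, 0, 0, 0, 0, 0, 0] * 2
streams = unpack_capture(words)
ones_density = [sum(s) / len(s) for s in streams]
print(ones_density)
```

The ones density of each stream is the crudest possible conversion; any custom filter can be run over the same lists.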
Scope mode can still have four different filters. Something like ~22, ~32, ~48, and ~64-tap modes. Scope mode is all about phase, not so much about level. I should have realized that earlier.
Yes, that's essential. I've thought about a single-bit scope mode in which each byte is eight successive ADC bits from a particular pin, so that four pins'-worth of 1-bit data can be streamed together. Running at 1/8th sysclock might be tricky, though.
I agree with that sort of spacing. Testing is needed to see how smaller sums compare to larger ones, bearing in mind result is 8-bit. How many shift bits are available now?
Does that include zeroes at start and end, therefore 66 non-zero taps?
My previous post was written before seeing the edited post above it.
I think we should use the double-integrating technique, since it affords us more control over the output value. The triple-integrating technique is neat, but we can't control the output value very well, and it may require more flops, anyway.
We may already have the tap sets. Remember the ones that ended with zeroed LSB's, so that when we shifted them down, they landed at 255?
It occurred to me yesterday that if we had a second-order modulator, instead of a simple first-order modulator, our scope mode would not work. Does anyone suppose otherwise?
I wish I had time to do some SPICE work. I really want to see if adding many parallel first-order modulators with different current offsets would get around the problems of the first order, but provide both a higher number of bits per clock, while still affording something like a simple scope mode that looks at a small range of samples.
There were four windows that used double-integration with sums of 256, 510, 1020 and 2040. I'll have a look for them. Double-integration needs a tap when the first-derivative changes, which means slopes of +/- 1/2/3 require 12 taps. Triple integration needs a tap when the second-derivative changes, so we can have upslope increasing/constant/decreasing, zero slope, downslope increasing/constant/decreasing and end of window with 8 taps. The constant slopes occur in the middle of the ramps and are very similar to the linear mid-ranges of Hann or Tukey windows. Also, the sum is known in advance with CMA(x,y,z) to be x*y*z and the total number of adder bits will be set by Sinc3 mode. I think ultimately it will come down to which level of integration requires the least logic.
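Two of those CMA properties are easy to check numerically: the window sum being x*y*z, and triple integration needing only eight taps. The Python sketch below builds a window by convolving three boxcars and then differences it three times to expose the taps; the lengths 4, 6 and 9 are arbitrary values picked so that no taps coincide, and the helper names are made up.

```python
def convolve(a, b):
    """Plain discrete convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def cma_kernel(x, y, z):
    """Window produced by cascading moving averages of lengths x, y and z."""
    return convolve(convolve([1] * x, [1] * y), [1] * z)

def difference(seq, times):
    """Apply the first difference `times` times (undoing that many integrations)."""
    for _ in range(times):
        seq = convolve(seq, [1, -1])
    return seq

kernel = cma_kernel(4, 6, 9)
taps = {i: v for i, v in enumerate(difference(kernel, 3)) if v}
print(sum(kernel))  # equals 4 * 6 * 9
print(taps)         # positions and signs of the non-zero taps
```

Feeding those eight signed taps into three cascaded integrators reproduces the window, which is the sense in which the tap count sets the logic cost.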
Did you look into "noise shaping"?
Appears to me that this is done by adding an "integrator" before the comparator.
But, sounds like this integrator is just a capacitor...
If it now updates at only range rate, it could pay to double-check SNR against a simple gated counter, which has the same update rate.
IIRC the filter should manage the clumping ends effect better?
I'd also like to see a dual plot of sinc2 vs sinc3, when fed with the same P2-SDM data.
Whatever filter you settle on, be sure to check it again with a DC input. Some low-pass filters have ripples in the passband that do not intersect the Y axis at the 0 dB mark. (As long as your coefficients are all positive, and the filtering operation involves no subtractions, you should be safe.)
The first-order modulator in the P2 does noise shaping (but a second-order can do it better).
If you are hitting a definite 1/f noise corner, keep in mind that the FPGA tests you run are at 80MHz and the P2 can run at 200+MHz.
That depends on a couple of things:
* Does the faster clock of the P2 give more bits before it hits the 1/f noise effect, or does sysCLK not matter above 80MHz? **
* Can SW add extra bits after these filters?
If SW cannot easily add more bits, that makes a stronger case for more flops.
Still, sinc2 up to 14b (with SW extension?) and sinc3 up to 18b covers a lot of use cases.
The important one I see is the 16b external ADCs, as there are many of those, and above that is 'nice to have'.
Did you check settling times on the sinc3, and its ability to eventually resolve to any DC value?
** Can you fudge the FPGA test system so it can capture from the test chip at higher clock speeds (while still running the P2 core at FPGA 80MHz)?
That may even just be a separate capture Verilog step, to give a stored bitstream, and you could cross-check with a real higher-MHz P2 capture, then fed into the FPGA test bench?
I will soon. Still a little experimentation left.
Sinc1 is just a square.
Sinc2 is the convolution of the square with itself -> a triangle with a base twice as wide as the square.
Sinc3 is the convolution of that triangle with the square again -> looks a lot like a Gaussian, with a base three times as wide as the square.
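Those shapes can be seen directly by convolving a short boxcar with itself in Python. This is only an illustration, with an arbitrary length of 4, and the helper function is mine.

```python
def convolve(a, b):
    """Plain discrete convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

n = 4
sinc1 = [1] * n                  # square
sinc2 = convolve(sinc1, sinc1)   # triangle, 2n-1 points
sinc3 = convolve(sinc2, sinc1)   # bell shape, 3n-2 points
print(sinc1)
print(sinc2)
print(sinc3)
```

The weights sum to n, n squared and n cubed respectively, which is where the rapid bit growth of the higher orders comes from.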
-Phil