Welcome to the Parallax Discussion Forums, sign-up to participate.


## Comments

Have you considered doing a 1024-point sinc2, then taking its output and running it through one of Saucy's cascaded moving average filters with a [1,6,15,20,20,15,6,1] kernel, or equivalent? That is, if you have a working version of cma that runs on hardware. [1,6,15,20,20,15,6,1] sums to 64, and with 10-bit input, where every bit is counted, you should get full coverage as far as "missed codes" are concerned. The nice part is that you don't have to wait 32K clocks to read the latest long count, just like the stock market, where you have a new 100-day moving average at the close of every trading day.
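For anyone who wants to check kernels like this by hand, here is a minimal sketch that builds a cascaded moving average kernel by repeated convolution. `convolve` and `cascade_boxcars` are illustrative helper names, not part of any P2 tool, and reading "cma" as a cascade of simple boxcar averages is an assumption.

```python
def convolve(a, b):
    """Plain integer convolution of two tap lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def cascade_boxcars(lengths):
    """Kernel of several moving averages run back to back (hypothetical helper)."""
    kernel = [1]
    for n in lengths:
        kernel = convolve(kernel, [1] * n)
    return kernel


# Six cascaded 2-point averages give the 7-tap binomial row.
binomial7 = cascade_boxcars([2] * 6)

# The 8-tap list exactly as quoted in the post.
quoted = [1, 6, 15, 20, 20, 15, 6, 1]

print(binomial7, sum(binomial7))
print(quoted, sum(quoted))
```

Note that the 7-tap binomial row [1,6,15,20,15,6,1] is the one that sums to 64; the 8-tap list as quoted, with 20 appearing twice, sums to 84, so the quoted kernel may contain a typo.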

Another path to 16b, if that is too costly fully in HW, would be to manage the 14b in HW, and add a SW filter ?

I presume the sinc2 filter can work with external CLK.DAT signals, as does sinc3 ?

There will be solutions that are combinations of the HW filters, and SW post-processing ?

The easiest to use modes will be the ones supported by hardware, but higher performance / combination cases can be mixed, and the HW filters can lower the data rates down to where SW can manage.

Here is an issue, though... In order to fit everything, I can't have a random reload counter. I need to fix conversions to 2^n clocks. Is that a problem?

I suppose the question I should be asking, then, is what to expect in the hardware and at what rate. Are we going to get sinc2 summed to 6 or 7 bits (including the overflow) at sysclock/4, sysclock/8 or sysclock/16, with or without overlap, and with or without sinc3 of some flavor? That should work quite nicely and would make me very happy, since I know that FFT is already possible for audio processing even without the CORDIC or Goertzel modes, and a faster processor with pipelined CORDIC alone should make orders of magnitude better FFT performance possible. I just don't want to be stuck with a hardwired A/D that only gives me a sample once every 32768 clocks, where I can't get under the hood, so to speak, to run my own algorithms, which I know will run even on a P1. It is just that there are various bottlenecks, like not having USB 2.0 at even 12 Mbit/s, which would allow me to do things like ADAT transcoding; that may not be the Propeller 1's fault, just the way the FTDI chip works, etc. And more pins!!! I am sure you know the rundown.

Otherwise, I think that some of the cma(x,y,z) stuff would be a great OBEX software cog. It is probably premature to think about asking for a Spin instruction that just takes the parameters and runs with them in optimized PASM, just as it would be premature to ask for FFT(x) as something that's easy to use and precompiled in the Spin core, since some people might want or need a polyphase filter instead, or a chirp-z, and there are a bazillion FFTs out there free for the downloading.

I have found that, in practice, it is important to take a big bite on these sinc summations. Small ones have almost no filtering effect.

Hmm, that's unfortunate, as the SysCLK may be constrained by more important things, and it is nice to be able to reject certain frequencies like 50Hz or 60Hz mains, or other intruding frequencies.

Maybe it can be made 'near enough' by using a nearest count in the SW side ?

eg taking your 14bit example, that runs I think at 102.4us sample rate so 976.5625 of those is 10 sps, needing 977 as nearest whole number average for a notch.

or 163 or 195 whole numbers, for someone focusing on higher sps and 60Hz or 50Hz rejects.

Looks to give > 55dB error from whole cycle times, so maybe that's good enough ?
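The arithmetic above can be checked with a short sketch. It takes the 102.4us sample period from the post and assumes the software side is a plain N-sample average (a boxcar); `nearest_count` and `boxcar_gain_db` are illustrative names only.

```python
import math

SAMPLE_PERIOD_S = 102.4e-6  # sample period quoted in the post


def nearest_count(window_s):
    """Nearest whole number of samples that fits the wanted window."""
    return round(window_s / SAMPLE_PERIOD_S)


def boxcar_gain_db(n, freq_hz):
    """Gain of an n-sample average at freq_hz, relative to DC, in dB."""
    x = math.pi * freq_hz * SAMPLE_PERIOD_S
    gain = abs(math.sin(n * x) / (n * math.sin(x)))
    return 20 * math.log10(gain)


cases = [
    ("10 sps window, 50 Hz", 0.1, 50.0),
    ("10 sps window, 60 Hz", 0.1, 60.0),
    ("one 60 Hz cycle", 1 / 60, 60.0),
    ("one 50 Hz cycle", 1 / 50, 50.0),
]
for label, window_s, mains_hz in cases:
    n = nearest_count(window_s)
    print(f"{label}: N = {n}, gain at {mains_hz:.0f} Hz = "
          f"{boxcar_gain_db(n, mains_hz):.1f} dB")
```

Under those assumptions the counts come out as 977, 163 and 195, and the residual mains gain is below -55 dB in each case, which agrees with the estimate above.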

Of course, but at some point you need to make the cut, and some things are done in HW, and some are done in SW. It is important to keep a careful eye on HW costs, to avoid blow-outs.

I think it's now worth testing how these sinc2 & sinc3 & scope filters work with some software situations.

Can you do a table to summarize the ADC filters and bit-ranges and samples rates of the filters that you have packed into the smart pin, & their tested status ?

I will soon. Still a little experimentation left.

I will make the Sinc2 handle up to 14-bit conversions. I'd like to limit the Sinc3 mode to 512 counts, instead of 1024. This means we could handle 18-bit external ADC modulators, but not 20-bit, anymore.

I ran the Sinc3 mode a lot and looked at the output. It's good for 14 bits (in as little as 128 clocks!), but after that, we get the same problem as Sinc2. This doesn't have anything to do with the digital filters, but just the limitation of our ADC. It's good for 14 bits, and beyond that, it's a mess.

I will merge the Sinc2 and Sinc3 modes. They will each have an integrator dump and clear mode, and Sinc2 will be able to do fully automatic A/D conversions. If I could limit Sinc3 to 512 counts, max, it would all fit very nicely. An alternative would be to add a few flops to the smart pin scheme, maybe 5 (x64 pins = 320 flops). That would allow 20-bit external ADC's, as well as 16-bit Sinc2, in case we ever get the ADC modulator working better.

The search for >14-bit is probably futile. The TI AMC1035 has differential inputs, chopped, with a second-order modulator, and in Sinc3 mode at a decimation rate of 512 the ENOB is ~13.75.

Merging Sinc2 and Sinc3 makes complete sense, as the third integrator adder for the latter can be repurposed as the differentiator for the former (with room to spare). One of us should have thought of that earlier.

Is the load counter necessary? Couldn't the output be read at any time with config bits specifying the scaling/bit shifting? It would be down to the user to select the correct scaling but it would permit rounding.

Will scope mode still have four different filters?

The load counter can publish and zero the final integrator on every Nth clock, giving you a whole period in which to pick up the result before it's overwritten again. Jmg thought it would not be a good idea to fix the sample-count possibilities to the obvious 2^n points, as this would preclude custom clock counts, which I understand. We will be able to read the final integrator at any time if we set the time frame to 1 count, which will inhibit the zeroing operation, so you can keep track yourself.
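As a rough illustration of what "keeping track yourself" could look like in software, here is a textbook sinc2 (CIC) sketch: two free-running integrators with 32-bit wraparound, read every N clocks, with the result recovered by a second difference of successive readings. This is an assumption about how one might use a free-running final integrator, not a description of the actual smart pin logic; all names are hypothetical.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers that wrap around


def integrator_readings(bits, n):
    """Run two cascaded integrators over the bitstream and
    return the second integrator's value at every n-th clock."""
    acc1 = acc2 = 0
    readings = []
    for clock, b in enumerate(bits, start=1):
        acc1 = (acc1 + b) & MASK
        acc2 = (acc2 + acc1) & MASK
        if clock % n == 0:
            readings.append(acc2)
    return readings


def sinc2_outputs(readings):
    """Second difference of successive readings, modulo 2^32.
    Wraparound in the integrators cancels out in the subtraction."""
    return [
        (readings[i] - 2 * readings[i - 1] + readings[i - 2]) & MASK
        for i in range(2, len(readings))
    ]


N = 8
all_ones = [1] * 64
all_zeros = [0] * 64
print(sinc2_outputs(integrator_readings(all_ones, N)))   # each value is N*N
print(sinc2_outputs(integrator_readings(all_zeros, N)))  # each value is 0
```

With an all-ones bitstream every output equals N*N, which is the full-scale value for a sinc2 of length N. Because the differences are taken modulo 2^32, N does not need to be a power of two.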

Oh, I want to also have a mode that just shifts in 32 consecutive ADC bits and publishes them every 32 clocks. That way, it will be really simple to get a bitstream recorded for custom processing. This could record 8 pins in a way much more useful than the streamer could:

Scope mode can still have four different filters. Something like ~22, ~32, ~48, and ~64-tap modes. Scope mode is all about phase, not so much about level. I should have realized that earlier.

Yes, that's essential. I've thought about a single-bit scope mode in which each byte is eight successive ADC bits from a particular pin, so that four pins'-worth of 1-bit data can be streamed together. Running at 1/8th sysclock might be tricky, though.

I agree with that sort of spacing. Testing is needed to see how smaller sums compare to larger ones, bearing in mind result is 8-bit. How many shift bits are available now?

Does that include zeroes at start and end, therefore 66 non-zero taps?

My previous post was written before seeing the edited post above it.

I think we should use the double-integrating technique, since it affords us more control over the output value. The triple-integrating technique is neat, but we can't control the output value very well, and it may require more flops, anyway.

We may already have the tap sets. Remember the ones that ended with zeroed LSB's, so that when we shifted them down, they landed at 255?

Sinc1 is just a square.

Sinc2 is convolution of square with itself -> triangle with base twice as wide as square.

Sinc3 is convolution of that triangle with the square again -> Looks a lot like a Gaussian with base three times as wide as the square.
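As a quick check of these shapes, here is a minimal sketch that builds each kernel by repeated convolution. It assumes the usual definition in which an order-k sinc filter is the N-sample square convolved with itself k times; `convolve` and `sinc_kernel` are illustrative helpers, not P2 functions.

```python
def convolve(a, b):
    """Plain integer convolution of two tap lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sinc_kernel(order, n):
    """n-sample square convolved with itself `order` times."""
    kernel = [1]
    for _ in range(order):
        kernel = convolve(kernel, [1] * n)
    return kernel


N = 4
for order in (1, 2, 3):
    taps = sinc_kernel(order, N)
    print(f"sinc{order}: length {len(taps)}, sum {sum(taps)}, taps {taps}")
```

For N = 4 this gives the square [1,1,1,1], the triangle [1,2,3,4,3,2,1], and the bell-shaped [1,3,6,10,12,12,10,6,3,1]. The lengths are N, 2N-1 and 3N-2, and the sums are N, N^2 and N^3.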

There were four windows that used double-integration with sums of 256, 510, 1020 and 2040. I'll have a look for them. Double-integration needs a tap when the first-derivative changes, which means slopes of +/- 1/2/3 require 12 taps. Triple integration needs a tap when the second-derivative changes, so we can have upslope increasing/constant/decreasing, zero slope, downslope increasing/constant/decreasing and end of window with 8 taps. The constant slopes occur in the middle of the ramps and are very similar to the linear mid-ranges of Hann or Tukey windows. Also, the sum is known in advance with CMA(x,y,z) to be x*y*z and the total number of adder bits will be set by Sinc3 mode. I think ultimately it will come down to which level of integration requires the least logic.
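The two CMA properties mentioned here (sum of x*y*z, and 8 taps for triple integration) can be checked with a small sketch. It reads CMA(x,y,z) as three cascaded moving averages of lengths x, y and z, which is an assumption, and the lengths 3, 5 and 9 are arbitrary example values; `cma` and `third_difference_taps` are hypothetical helper names.

```python
def convolve(a, b):
    """Plain integer convolution of two tap lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def cma(x, y, z):
    """Kernel of three cascaded moving averages (assumed meaning of CMA)."""
    kernel = [1]
    for n in (x, y, z):
        kernel = convolve(kernel, [1] * n)
    return kernel


def third_difference_taps(kernel):
    """Differentiate three times and list the (position, value) pairs
    that are non-zero. Integrating these taps three times would
    rebuild the original kernel."""
    d = kernel
    for _ in range(3):
        d = convolve(d, [1, -1])
    return [(i, v) for i, v in enumerate(d) if v != 0]


kernel = cma(3, 5, 9)
taps = third_difference_taps(kernel)
print("kernel sum:", sum(kernel))
print("taps:", taps)
```

The kernel sum is 3*5*9 = 135, and only 8 taps of +1 or -1 survive the third difference, at positions given by the subset sums of the three lengths (0, 3, 5, 8, 9, 12, 14, 17). If two subset sums coincide, some taps merge or cancel and the count changes.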

Appears to me that this is done by adding an "integrator" before the comparator.

But, sounds like this integrator is just a capacitor...

IIRC the filter should manage the clumping ends effect better ?

I'd also like to see a dual plot of sinc2 vs sinc3, when fed with the same P2-SDM data

-Phil

The first-order modulator in the P2 does noise shaping (but a second-order can do it better).
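For readers new to the term, here is an idealized first-order sigma-delta modulator in a few lines. It is a generic textbook model with a perfect integrator and comparator, not a model of the P2's analog front end; the function name and the 0.5 threshold are choices made for this sketch.

```python
def first_order_modulator(level, n_bits):
    """Idealized first-order sigma-delta modulator.
    `level` is the input as a fraction of full scale (0.0 to 1.0).
    The integrator accumulates the error between the input and the
    previous output bit, and the comparator turns that into the next bit."""
    integrator = 0.0
    feedback = 0
    bits = []
    for _ in range(n_bits):
        integrator += level - feedback
        bit = 1 if integrator >= 0.5 else 0
        feedback = bit
        bits.append(bit)
    return bits


level = 0.3
bits = first_order_modulator(level, 1024)
density = sum(bits) / len(bits)
print("first 32 bits:", "".join(str(b) for b in bits[:32]))
print("ones density:", density)
```

Because the integrator stays bounded, the running count of ones can never drift more than a couple of counts from the ideal value, so the quantization error is pushed toward high frequencies where the sinc filters remove it. That is the noise shaping referred to above.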

That depends on a couple of things

* Does the faster clock of the P2 give more bits before it hits the 1/f noise effect, or does sysCLK not matter above 80MHz? **

* Can SW add extra bits after these filters ?

If SW cannot easily add more bits, that makes a stronger case for more flops.

Still, sinc2 up to 14b (with SW extension ?) and sinc3 up to 18b covers a lot of use cases.

The important one I see is the 16b external ADCs, as there are many of those, and above that is 'nice to have'.

Did you check settling times on the sinc3, and ability to eventually resolve to any DC value ?

** Can you fudge the FPGA test system, so it can capture from the test chip at higher clock speeds ( while still running the P2 core at FPGA-80MHz.)

That may even just be a separate capture verilog step, to give a stored bitstream, and you could cross-check with a real Higher-MHz P2 capture, then fed into the FPGA test bench ?