
• Posts: 10,097
Yup. My sentiments exactly.
• Posts: 6,432
edited 2018-11-21 - 22:44:49
Not to throw a wrench in things, but after all, that's what I'm good at...

If you have a DAC at your disposal and a comparator (an ADC in 1-bit mode), then you have all you need.

It becomes a simple game of guess-the-number in as few steps as possible (a binary search). The only feedback you have is whether your "guess" is higher or lower than the input. Each iteration adds one bit of resolution.

For a simplistic example, let's start with an arbitrary input voltage of 2.4V

Initial State:
RefLOW = 0V
RefHIGH = 3.3V
RefMID = (RefHIGH+RefLOW)>>1

*RefMID is what the DAC is loaded with, so the initial state is 1.65V

The simple rule for each iteration is:
```
IF the input voltage is greater than or equal to RefMID then
left shift a 1 into the result
replace RefLOW with RefMID
recalculate RefMID
ELSE
left shift a 0 into the result
replace RefHIGH with RefMID
recalculate RefMID
ENDIF
```

After 8 iterations you get:
```
Step      BIT  REFlow   REFhigh  REFmid   Vin
Initial        0        3.3      1.65     2.4
BIT7      1    1.65     3.3      2.475    2.4
BIT6      0    1.65     2.475    2.0625   2.4
BIT5      1    2.0625   2.475    2.2687   2.4
BIT4      1    2.2687   2.475    2.3718   2.4
BIT3      1    2.3718   2.475    2.4234   2.4
BIT2      0    2.3718   2.4234   2.3976   2.4
BIT1      1    2.3976   2.4234   2.4105   2.4
BIT0      0    2.3976   2.4105   2.4041   2.4
```

... the 8 result bits, 10111010 in binary, equal 186 .... 186/255 * 3.3V is approximately 2.4V
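The rule above can be sketched in a few lines of Python. This is only a software model of the search: the comparison `vin >= ref_mid` stands in for the comparator, `ref_mid` for the value loaded into the DAC, and the function name `sar_convert` and its parameters are made up for illustration.

```python
def sar_convert(vin, vref=3.3, bits=8):
    """Binary-search conversion: one comparator decision per result bit."""
    ref_low, ref_high = 0.0, vref
    result = 0
    for _ in range(bits):
        ref_mid = (ref_low + ref_high) / 2  # value the DAC would be loaded with
        if vin >= ref_mid:                  # stands in for the comparator output
            result = (result << 1) | 1      # shift a 1 into the result
            ref_low = ref_mid
        else:
            result = result << 1            # shift a 0 into the result
            ref_high = ref_mid
    return result


code = sar_convert(2.4)
print(code, format(code, "08b"))       # 186 10111010
print(round(code / 255 * 3.3, 3))      # 2.407
```

With Vin = 2.4V this reproduces the bit sequence in the table. Note that the model assumes Vin stays constant for all eight comparisons.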

• Posts: 22,731
pedward wrote:
There's a saying about spending someone else's money. If Chip wants to do this, it's his money, his chip, his prerogative.
Well, not really. There's a whole corporate ecosystem that's been adversely affected by the delays and incessant fiddling with this chip. And that includes not only the outlay of financial capital, but also the negative impact on human capital, resulting from the necessary belt-tightening required to pay for this effort. At some point, it becomes not just a technical/financial issue, but a moral one as well.

-Phil
• Posts: 13,017
edited 2018-11-21 - 23:13:19
Beau, great idea. We could use 16-bit random-dithered mode on one pin and A>B mode on the companion pin to do a successive-approximation ADC. I never thought about it before. Not sure how precise the comparator is, yet.
• Posts: 13,017
pedward wrote: »
Chip, it sounds like if you let the ADCs free run, they will become slower and slower to respond to a voltage change; the older samples weight greater than the newer samples, correct? Or do all the samples weight evenly? I assume, because it's an accumulator, and bits march left, the older data will have more effect than newer data, assuming you have a steady signal during the sample window.

If the sample window is extended and the data is changing radically, the newer samples have the ability to out-vote the older samples, or more likely, the sample ends up becoming a rough average of the quickly changing values.

If you left the counters in free run mode, for slowly changing data, the older samples would weight the data, but also you'd end up overflowing the accumulators, so you MUST reset them periodically.

I assume that resetting the accumulator subjects it to the "prickles" of the head and tail ends of the bitstream?

Ultimately, you're accumulating samples, and the samples that are the oldest are the heaviest in weight, because they are furthest left in the accumulator. The incomplete first sample throws off the initial accumulation? How is the final sample perturbing the accumulation as well?

I don't know enough to answer any of that, yet. EvanH is doing a lot of experiments and may come up with something useful.
• Posts: 13,017
edited 2018-11-21 - 23:24:05
jmg wrote: »
cgracey wrote: »
Those tables that TonyB_ made are actually pretty efficient. I think the pure logic version is no bigger than the counter version would be.

That bit I am not quite following ?
You still need a counter at the ends, as the total sample time is fully variable, but the window ends are fixed

I would also be wary of pure logic, as that can add delays which the larger 32b adder might not tolerate. ie those big adders are likely already pushing timing.
Better to have the window values emitted from a register, so it behaves like a registered ROM/State engine in the timing.

Also, the NCO smart pin mode already has a 32b adder - is this new mode coded to carefully reuse that existing adder ?
cgracey wrote: »
I'm thinking that it might be good to investigate a 16-sample window with less resolution. While this windowing is critical, we are only talking a handful of bits in a much larger sample. I think we could give them less resolution on X and Y and it would probably be just fine. Like, instead of a 5x6, maybe a 4x4 would be sufficient.

Certainly, because this duplicates 64 times, the size needs to be pruned to the absolute minimum.

Jmg, I agree that the 6-bit weighting factor needs to be registered, ready to add at the start of the clock into the 32-bit accumulator.

I was talking about whether we should use a counter with variable increments, like TonyB_ had originally proposed, or a static logic implementation to generate the weighting factor from a counter. If we are going to have to have six flops for the weighting factor, we might as well just do a variable-increment counter, since it's smaller.
• Posts: 1,517
edited 2018-11-22 - 02:19:15
cgracey wrote: »
TonyB_ wrote: »
cgracey wrote: »
TonyB_, this Tukey data you gave me is really 180..0 degrees of a cosine plus 1, right? I was looking at the wiki and that's what I gathered.

I think you came up with a special rendition that would be easy to generate sequentially, right? The first two values are zero, making it really a 30-step window, but that helped keep the numerics simple, correct?

Chip, the Tukey ramps are ((cos -180..cos 0)+1)/2 and ((cos0..cos+180)+1)/2.

Only 28 steps are needed to get from 1 to 64, so I added two zeroes and two 64's at the start and end to make things symmetrical. I can post the logic for the accumulator increment later today. I'll check first whether having four zeroes reduces the logic, but it may well increase it.

We can't use the output of a 5-bit counter directly for the Tukey accumulator, obviously. The counter is used to generate the next 2-bit increment to add to a 6-bit accumulator.

Yes, I'm working out the adder values now:
```
pos   00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
-----------------------------------------------------------------------------------------------
up    01 01 01 02 02 02 02 02 03 03 03 03 03 03 03 03 03 03 03 03 03 02 02 02 02 02 01 01 00 00 00 00
Tukey 00 01 02 03 05 07 09 0B 0D 10 13 16 19 1C 1F 22 25 28 2B 2E 31 34 36 38 3A 3C 3E 3F 40 40 40 40
dn    00 1F 1F 1F 1E 1E 1E 1E 1E 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1E 1E 1E 1E 1E 1F 1F 00 00 00
```

'up' works left-to-right.
'dn' works right-to-left.

When we are doing the main sample accumulation, we keep using the Tukey value, which will be at 64.

I'm wondering if it's best to work this windowing into the period set by WXPIN or to use 32+WXPIN+32. Either can be done with about the same logic, I think.

Here is a new, shorter Tukey window (Tukey24/64) and, unlike the earlier ones, it has perfect symmetry: first+last=64, second+last but one=64, etc. The number of product terms is lower than before, so the logic is smaller. The largest increment is bigger at +/- 4.
```
pos   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
-----------------------------------------------------------------------
up     +1 +1 +1 +2 +3 +3 +3 +4 +4 +4 +4 +4 +4 +4 +4 +4 +3 +3 +3 +2 +1 +1 +1
Tukey 00 01 02 03 05 08 11 14 18 22 26 30 34 38 42 46 50 53 56 59 61 62 63 64
dn      -1 -1 -1 -2 -3 -3 -3 -4 -4 -4 -4 -4 -4 -4 -4 -4 -3 -3 -3 -2 -1 -1 -1
```
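As a quick check of the Tukey24/64 table, here is a small Python sketch that accumulates the 'up' increments into the window values and tests the symmetry claim. The list `UP` is copied from the table; the helper name `ramp_from_increments` is invented for this example, and `accumulate(..., initial=...)` needs Python 3.8 or later.

```python
from itertools import accumulate

# 'up' increments for Tukey24/64, copied from the table above.
UP = [1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 2, 1, 1, 1]


def ramp_from_increments(increments, start=0):
    """Accumulate a list of increments into the sequence of window values."""
    return list(accumulate(increments, initial=start))


# Rising ramp: start at 0 and apply 'up' left-to-right.
tukey = ramp_from_increments(UP)
print(tukey)

# Symmetry: first+last = 64, second+last but one = 64, and so on.
symmetric = all(a + b == 64 for a, b in zip(tukey, reversed(tukey)))
print(symmetric)

# Falling ramp: start at 64 and apply the negated increments right-to-left.
down = ramp_from_increments([-d for d in reversed(UP)], start=64)
print(down == tukey[::-1])
```

Because the increment list reads the same in both directions, the falling ramp is simply the rising ramp reversed.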
• Posts: 13,017
TonyB_ wrote: »
cgracey wrote: »
TonyB_ wrote: »
cgracey wrote: »
TonyB_, this Tukey data you gave me is really 180..0 degrees of a cosine plus 1, right? I was looking at the wiki and that's what I gathered.

I think you came up with a special rendition that would be easy to generate sequentially, right? The first two values are zero, making it really a 30-step window, but that helped keep the numerics simple, correct?

Chip, the Tukey ramps are ((cos -180..cos 0)+1)/2 and ((cos0..cos+180)+1)/2.

Only 28 steps are needed to get from 1 to 64, so I added two zeroes and two 64's at the start and end to make things symmetrical. I can post the logic for the accumulator increment later today. I'll check first whether having four zeroes reduces the logic, but it may well increase it.

We can't use the output of a 5-bit counter directly for the Tukey accumulator, obviously. The counter is used to generate the next 2-bit increment to add to a 6-bit accumulator.

Yes, I'm working out the adder values now:
```
pos   00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
-----------------------------------------------------------------------------------------------
up    01 01 01 02 02 02 02 02 03 03 03 03 03 03 03 03 03 03 03 03 03 02 02 02 02 02 01 01 00 00 00 00
Tukey 00 01 02 03 05 07 09 0B 0D 10 13 16 19 1C 1F 22 25 28 2B 2E 31 34 36 38 3A 3C 3E 3F 40 40 40 40
dn    00 1F 1F 1F 1E 1E 1E 1E 1E 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1D 1E 1E 1E 1E 1E 1F 1F 00 00 00
```

'up' works left-to-right.
'dn' works right-to-left.

When we are doing the main sample accumulation, we keep using the Tukey value, which will be at 64.

I'm wondering if it's best to work this windowing into the period set by WXPIN or to use 32+WXPIN+32. Either can be done with about the same logic, I think.

Here is a new, shorter Tukey window and, unlike the earlier ones, it has perfect symmetry: first+last=64, second+last but one=64, etc. The number of product terms is lower than before, so the logic is smaller. The largest increment is bigger at +/- 4.
```
pos   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
-----------------------------------------------------------------------
up     +1 +1 +1 +2 +3 +3 +3 +4 +4 +4 +4 +4 +4 +4 +4 +4 +3 +3 +3 +2 +1 +1 +1
Tukey 00 01 02 03 05 08 11 14 18 22 26 30 34 38 42 46 50 53 56 59 61 62 63 64
dn      -1 -1 -1 -2 -3 -3 -3 -4 -4 -4 -4 -4 -4 -4 -4 -4 -3 -3 -3 -2 -1 -1 -1
```

Beautiful, TonyB_!
• Posts: 2,367
cgracey wrote: »
Beau, great idea. We could use 16-bit random-dithered mode on one pin and A>B mode on the companion pin to do a successive-approximation ADC. I never thought about it before. Not sure how precise the comparator is, yet.

For a SAR ADC you will need a Sample & Hold circuit so that the input voltage does not change during conversion. Otherwise you get quite wrong results.

Andy
• Posts: 13,017
edited 2018-11-22 - 00:51:02
Ariba wrote: »
cgracey wrote: »
Beau, great idea. We could use 16-bit random-dithered mode on one pin and A>B mode on the companion pin to do a successive-approximation ADC. I never thought about it before. Not sure how precise the comparator is, yet.

For a SAR ADC you will need a Sample & Hold circuit so that the input voltage does not change during conversion. Otherwise you get quite wrong results.

Andy

Ah, yes. We would need some kind of switch.

I guess we could always make a tracking ADC.
• Posts: 1,517
edited 2018-11-22 - 02:19:46
cgracey wrote: »
TonyB_ wrote: »
Here is a new, shorter Tukey window and, unlike the earlier ones, it has perfect symmetry: first+last=64, second+last but one=64, etc. The number of product terms is lower than before, so the logic is smaller. The largest increment is bigger at +/- 4.
```
pos   00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
-----------------------------------------------------------------------
up     +1 +1 +1 +2 +3 +3 +3 +4 +4 +4 +4 +4 +4 +4 +4 +4 +3 +3 +3 +2 +1 +1 +1
Tukey 00 01 02 03 05 08 11 14 18 22 26 30 34 38 42 46 50 53 56 59 61 62 63 64
dn      -1 -1 -1 -2 -3 -3 -3 -4 -4 -4 -4 -4 -4 -4 -4 -4 -3 -3 -3 -2 -1 -1 -1
```

Beautiful, TonyB_!

And here is how the new Tukey24/64 looks, compared to the original. There are four zeroes and four 64s added merely as padding and only 24 samples are needed for each ramp (4-27 in this chart).

• Posts: 1,517
edited 2018-11-22 - 02:20:30
I hope the new Tukey24/64 performs at least as well as the earlier one. More data:
```
Sample, Nearest Integer, Actual
0,  0,  0.07
1,  1,  0.61
2,  2,  1.70
3,  3,  3.30
4,  5,  5.39
5,  8,  7.94
6, 11, 10.90
7, 14, 14.22
8, 18, 17.85
9, 22, 21.71
10, 26, 25.76
11, 30, 29.91
12, 34, 34.09
13, 38, 38.24
14, 42, 42.29
15, 46, 46.15
16, 50, 49.78
17, 53, 53.10
18, 56, 56.06
19, 59, 58.61
20, 61, 60.70
21, 62, 62.30
22, 63, 63.39
23, 64, 63.93
```
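The "Actual" column above appears to be consistent with a raised cosine sampled at half-step offsets, i.e. 32 * (1 - cos(pi * (k + 0.5) / 24)) for sample k. That formula is inferred from the posted numbers and is not stated anywhere in the thread, so treat the sketch below as an assumption; `tukey_ramp` is a made-up name.

```python
import math


def tukey_ramp(samples, peak):
    """Raised-cosine ramp sampled at half-step offsets (assumed fit to the posted data)."""
    return [peak / 2 * (1 - math.cos(math.pi * (k + 0.5) / samples))
            for k in range(samples)]


actual = tukey_ramp(24, 64)
nearest = [round(v) for v in actual]

# Print in the same layout as the table above.
for k, (n, a) in enumerate(zip(nearest, actual)):
    print(f"{k:2d}, {n:2d}, {a:5.2f}")
```

Under this assumption the rounded values reproduce the "Nearest Integer" column, and calling `tukey_ramp` with other sample counts and maximums is a cheap way to try alternative window sizes.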
• Posts: 13,017
TonyB_ wrote: »
I hope this new Tukey window performs at least as well as the earlier one. More data:
```
Sample, Nearest Integer, Actual
0,  0,  0.07
1,  1,  0.61
2,  2,  1.70
3,  3,  3.30
4,  5,  5.39
5,  8,  7.94
6, 11, 10.90
7, 14, 14.22
8, 18, 17.85
9, 22, 21.71
10, 26, 25.76
11, 30, 29.91
12, 34, 34.09
13, 38, 38.24
14, 42, 42.29
15, 46, 46.15
16, 50, 49.78
17, 53, 53.10
18, 56, 56.06
19, 59, 58.61
20, 61, 60.70
21, 62, 62.30
22, 63, 63.39
23, 64, 63.93
```

What a fantastic fit!
• Posts: 1,517
Chip, I'm working on a 16 sample Tukey now - nearly done.
• Posts: 13,017
TonyB_ wrote: »
Chip, I'm working on a 16 sample Tukey now - nearly done.

Super. Maybe it could top-out at 32.
• Posts: 1,517
cgracey wrote: »
TonyB_ wrote: »
Chip, I'm working on a 16 sample Tukey now - nearly done.

Super. Maybe it could top-out at 32.

Now that I've got the hang of this, we could have a max value of 32 or 64. I've called tonight's new one Tukey24/64 (samples/max) and we could have the following set:

Tukey16/32
Tukey16/64
Tukey24/32
Tukey24/64
• Posts: 341
I was going to request a successive approximation ADC, but Beau got to it first...
cgracey wrote: »
If we have smart pin filtering, you can pick up a filtered sample every 6,176 clocks.

Without smart pin filtering, you can capture a sample's worth of bits every 6,176 clocks and then process them in software. The time to process in software will be less than the sample time, but it would eat into your CPU time by ~20%. While that's not much, it would be a big bugaboo to have to have the filtering code resident and lose use of the streamer.
What if we could use a counter for the bulk of samples and just apply software windowing to the ends?

Doesn't solve the code resident problem, so that is one reason to have it in hardware. Another would be that it just makes the ADC easier to use.
• Posts: 1,517
TonyB_ wrote: »
cgracey wrote: »
TonyB_ wrote: »
Chip, I'm working on a 16 sample Tukey now - nearly done.

Super. Maybe it could top-out at 32.

Now that I've got the hang of this, we could have a max value of 32 or 64. I've called tonight's new one Tukey24/64 (samples/max) and we could have the following set:

Tukey16/32
Tukey16/64
Tukey24/32
Tukey24/64

Increments:
```
Name,Sample 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
Tukey16/32, 0, 1, 2, 4, 6, 8,11,14,18,21,24,26,28,30,31,32
Tukey16/64, 0, 1, 4, 7,12,17,23,29,35,41,47,52,57,60,63,64
Tukey24/32, 0, 0, 1, 2, 3, 4, 5, 7, 9,11,13,15,17,19,21,23,25,27,28,29,30,31,32,32
Tukey24/64, 0, 1, 2, 3, 5, 8,11,14,18,22,26,30,34,38,42,46,50,53,56,59,61,62,63,64
```

Product terms (PTs):
```
Name,PTs:  X6,X5,X4,X3,X2,X1,X0,total
Tukey16/32, 0, 1, 3, 4, 4, 7, 2, 21
Tukey16/64, 1, 3, 4, 4, 5, 3, 4, 24
Tukey24/32, 0, 1, 3, 3, 5, 4, 7, 23
Tukey24/64, 1, 4, 4, 5, 5, 6, 2, 27
```
• Posts: 1,517
Tukey16/64 has good data agreement:
```
Sample, Nearest Integer, Actual
0,  0,  0.15
1,  1,  1.38
2,  4,  3.78
3,  7,  7.26
4, 12, 11.70
5, 17, 16.92
6, 23, 22.71
7, 29, 28.86
8, 35, 35.14
9, 41, 41.29
10, 47, 47.08
11, 52, 52.30
12, 57, 56.74
13, 60, 60.22
14, 63, 62.62
15, 64, 63.85
```

There is better resolution with max of 64.
• Posts: 9,836
Anything that comes out to zero on the edges is not contributing. Either chop them off or recalculate.

• Posts: 13,017
evanh wrote: »
Anything that comes out to zero on the edges is not contributing. Either chop them off or recalculate.

I think that TonyB_ found that to get accuracy at these practical maximums of 32 and 64, the best fit went into fewer than 16 or 32 positions, hence the zero fillers. I would imagine it is a net benefit to have a few zeros and get a much better fit.

To start with a '1', you must start the cosine from less than 180°, where the '1' would actually be.
• Posts: 9,836
Presumably the calculation can be done for a few extra positions and then truncated to the desired 16 or 32. But I suppose 15 or 30 is just as easy to implement.
• Posts: 341
Chip's original trapezoidal window reduces a length-7 repeating sequence by 22dB compared to a rectangular window. For those who don't work with dB very often, that means the noise is reduced to 8% of its original amplitude.

I'm not sure yet whether the Tukey or the trapezoid is better. We really need to get an FFT of the bitstream to know how to tune the window function.

Note how Tukey24/64 has excellent rejection for a length-7 sequence. If the window function is adjustable that could be useful if some applications have high noise at particular frequencies.
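For anyone who wants to check the 22dB figure, the conversion from dB to an amplitude ratio is 10^(dB/20). A tiny Python sketch (the function name is made up for illustration):

```python
def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a linear amplitude (voltage) ratio."""
    return 10 ** (db / 20)


ratio = db_to_amplitude_ratio(-22)
print(f"{ratio:.1%}")  # 7.9%
```

A 22dB reduction therefore leaves roughly 8% of the original amplitude.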
• Posts: 14,474
SaucySoliton wrote: »
Chip's original trapezoidal window reduces a length-7 repeating sequence by 22dB compared to a rectangular window. For those who don't work with dB very often, that means the noise is reduced to 8% of its original amplitude.

I'm not sure yet whether the Tukey or the trapezoid is better. We really need to get an FFT of the bitstream to know how to tune the window function.

Note how Tukey24/64 has excellent rejection for a length-7 sequence. If the window function is adjustable that could be useful if some applications have high noise at particular frequencies.

Don't those numbers assume the filter is always in line ?
In this ADC implementation, the filter is only present on ~1% of the samples - the first and last ones.
• Posts: 341
jmg wrote: »
Don't those numbers assume the filter is always in line ?
In this ADC implementation, the filter is only present on ~1% of the samples - the first and last ones.

I calculated with 4096 total samples; the window applies to 14-31 samples at each end. That is about 1% of samples with differing weights. So the windowing could be done in software with little overhead if the hardware could handle the middle part, where every sample has the same weight.
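A rough Python sketch of that split, assuming the Tukey24/64 ramp from earlier in the thread and a random stand-in bitstream. `windowed_sum` weights every bit explicitly, while `split_sum` uses a plain count for the middle (the part a hardware counter could do) and applies weights only to the 24 bits at each end. Both function names are hypothetical.

```python
import random

# Tukey24/64 ramp values posted earlier in the thread.
TUKEY = [0, 1, 2, 3, 5, 8, 11, 14, 18, 22, 26, 30,
         34, 38, 42, 46, 50, 53, 56, 59, 61, 62, 63, 64]


def windowed_sum(bits, ramp=TUKEY):
    """Reference version: build the full weight list and weight every bit."""
    n, r = len(bits), len(ramp)
    weights = ramp + [ramp[-1]] * (n - 2 * r) + ramp[::-1]
    return sum(w * b for w, b in zip(weights, bits))


def split_sum(bits, ramp=TUKEY):
    """Weighted ends plus a plain count of the middle scaled by the peak weight."""
    r = len(ramp)
    head = sum(w * b for w, b in zip(ramp, bits[:r]))
    tail = sum(w * b for w, b in zip(ramp[::-1], bits[-r:]))
    middle = ramp[-1] * sum(bits[r:-r])   # what a simple counter would deliver
    return head + middle + tail


random.seed(1)                            # arbitrary stand-in for an ADC bitstream
bits = [random.randint(0, 1) for _ in range(4096)]
print(windowed_sum(bits) == split_sum(bits))
print(2 * len(TUKEY) / len(bits))         # fraction of bits needing a weight
```

Here 48 of the 4096 bits need individual weights, which is the "about 1%" mentioned above.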
• Posts: 14,474
edited 2018-11-22 - 08:54:47
cgracey wrote: »
Jmg, I agree that the 6-bit weighting factor needs to be registered, ready to add at the start of the clock into the 32-bit accumulator.

I was talking about whether we should use a counter with variable increments, like TonyB_ had originally proposed, or a static logic implementation to generate the weighting factor from a counter. If we are going to have to have six flops for the weighting factor, we might as well just do a variable-increment counter, since it's smaller.

There are various ways to pack this, and it's not so easy to detail the least-logic in the final ASIC.
I did find that a D-FF is appx 6.5 NAND gate equivalents, so registers themselves are not a high cost.

Coding the last couple of cosine 64Y tables comes in at 79 PT and at 60 PT, of width 8 and for the minimum of 7 registers, which is quite few gates for those tables.

Another approach is to code what the tools should handle well, which could be
* A 7b+4B Sync Adder, with reset (as the output) - a small adder like this, should be compact
* A 4 register U/D/enable counter, with reset, for the eg +4..-4 step sizes (MSB sign extends as needed). In T-FF's this is 2PT per counter bit, widths up to 4, so is quite compact.
* A change decoder, IncEn/DecEn that extracts just the time to change up 1 or down 1 - this packs to appx 7PT/5PT widths of 8, for the lowest wide-PT count.
• Posts: 13,017
SaucySoliton wrote: »
Chip's original trapezoidal window reduces a length-7 repeating sequence by 22dB compared to a rectangular window. For those who don't work with dB very often, that means the noise is reduced to 8% of its original amplitude.

I'm not sure yet whether the Tukey or the trapezoid is better. We really need to get an FFT of the bitstream to know how to tune the window function.

Note how Tukey24/64 has excellent rejection for a length-7 sequence. If the window function is adjustable that could be useful if some applications have high noise at particular frequencies.

SaucySoliton, thanks a lot for running those analyses. Within the bitstream, there are 7..2-bit repeating patterns.

Is there much extra benefit to going beyond 32 samples? Does an 8-sample window work much worse?

It would be good to know what would be optimal.
• Posts: 1,439
edited 2018-11-22 - 08:44:26
I am still against a filter the way it is done. If you lost your key, you can search in the light of a street lamp, or in the dark place where you supposedly lost it.
Right now we are solving problems we don't have. The Propeller will never be used to sample a GHz signal, as is the case in communication networks. What we have to understand is that a value NEVER exists at a certain moment in time. That means you cannot measure a signal at a given "point" in time. That is what sampling tries to do, but never achieves. To have the best ADC you have to pay the highest price. It is just integration that brought the price down. The type of ADC we have here is simple to understand: a signal is compared to another signal over and over. The input signal is what you want to know, and the second signal is what you know already. This knowledge is the HISTORY of the unknown signal.
When you start the ADC, the cap is not charged, and both inputs, the signal and the compensation, charge the cap until the threshold of the comparator is reached. At that moment we know that the input and the feedback are balanced. From then on we just maintain this state by selecting the feedback according to the comparator output.
So at every moment you only know whether the input signal is higher or lower than the cap voltage. Only if you know the history of how the cap was balanced do you know more exactly what level the input voltage is. You gain this knowledge of the history by reading the feedback counter at different times.
And we are completely free to choose the time stamps. They should suit the problem we are solving.
Historically, SA-ADCs were fast and cheap. Then delta-sigma became better and cheaper. As they had to replace SAs, they were optimized to mimic the "well understood" SAs.
There is something wrong with the streamer; there is no other reason for the signal that the scope shows us at the start and tail. So just don't start or stop and we are fine. This ADC NEVER gives you a value; it only gives you the count of the compensation pulses needed to balance the cap. Noise (shot noise) will always happen, but here we see a clear "signature" of the error.
There will be a next silicon if the P2 is a success, and if not, there will be a P2 as long as the fab makes chips. So let us focus on making the existing P2 a success, so that there are next generations!
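The balancing idea described above can be modelled with an idealized first-order loop in Python. This is a textbook-style sketch, not the P2's actual ADC circuit: `cap` stands for the integrating capacitor, the sign test for the comparator, and `feedback_pulse_count` is an invented name.

```python
def feedback_pulse_count(vin, n_clocks, vdd=3.3):
    """Count the feedback pulses needed to keep an ideal integrator balanced."""
    cap = 0.0    # integrator state; only its sign matters to the comparator
    count = 0
    for _ in range(n_clocks):
        if cap >= 0:           # comparator: cap is high, so apply a feedback pulse
            count += 1
            cap += vin - vdd
        else:                  # no pulse this clock, the input alone charges the cap
            cap += vin
    return count


n = 4096
count = feedback_pulse_count(2.4, n)
print(count, count / n * 3.3)
```

In this model the pulse count over n clocks stays within about one count of n * Vin / Vdd, so reading the counter at two different times and subtracting gives the average input over that interval.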
• Posts: 9,836
edited 2018-11-22 - 09:12:33
ErNa wrote: »
There is something wrong with the streamer; there is no other reason for the signal that the scope shows us at the start and tail. So just don't start or stop and we are fine. This ADC NEVER gives you a value; it only gives you the count of the compensation pulses needed to balance the cap. Noise (shot noise) will always happen, but here we see a clear "signature" of the error.

Chip was cheating posting those photos. All that has happened there is unclipped roll-over on the 8-bit DACs when outputting to the scope. I thought about saying something at the time but was tired.

PS: If you look at the voltage scale of 0.5 V/div you can see the trace is swinging more than 3.0 volts. That's rail-to-rail for the DACs.
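To illustrate what unclipped roll-over does, here is a minimal Python sketch comparing wrapping and clipping when a value is sent to an 8-bit DAC. The sample values are invented for illustration and are not taken from Chip's data.

```python
def to_dac_wrapped(value):
    """Keep only the low 8 bits, so out-of-range values roll over."""
    return value & 0xFF


def to_dac_clipped(value):
    """Saturate at the ends of the 8-bit range instead of wrapping."""
    return max(0, min(255, value))


# Made-up samples hovering near the top and bottom of the DAC range.
samples = [250, 254, 258, 253, -3, 2]
print([to_dac_wrapped(s) for s in samples])   # [250, 254, 2, 253, 253, 2]
print([to_dac_clipped(s) for s in samples])   # [250, 254, 255, 253, 0, 2]
```

A sample only a few counts out of range lands at the opposite end of the scale when wrapped, which shows up on a scope as a near rail-to-rail spike.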
• Posts: 1,439
edited 2018-11-22 - 11:17:37
ErNa wrote: »
Brought this up again, as I added some questions
cgracey wrote: »
More pictures. These are details of 4096-sample conversions (12-bit) with the ADC receiving a finely-stepped ramp.

First, the straight accumulator approach (no windowing). The monitor DAC was wrapping vertically because of the noise amplitude:

It looks like the windowing is getting rid of sporadic +1/+2 contributions from the initial and terminal bits samples.
So, could you explain it in a way that lets me grasp what I see: is the analog voltage at the input a staircase function?
Do you read out the counter at a given period?
Do you subtract the last read value from the current value?

If I didn't know better, I would think you look into the future as you start and into the past as you stop reading ;-)

And just another weird idea: couldn't it be that the spikes are an overflow of the scope or the ADC that outputs the signal?

That is very close to my guess.