
# Iterative binomial filter in Spin

Posts: 22,053
edited October 19
I'm working on a Propeller app that involves the smoothing of time-series data. One method that's often used for this is the very simplistic exponential averaging:

avg[i] = avg[i-1] * p + data * (1 - p), where 0 < p < 1

The closer p is to 1, the more smoothing you get. In physical terms, this is equivalent to a simple RC filter and is a type of infinite impulse response (IIR) filter. This means that, theoretically, a datum an infinite time in the past still affects the current average. Of course, due to rounding and truncation in computer math, the effect is finite, but can be quite long.
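In code, the recurrence is one line per sample; here's a minimal Python sketch (the function name and the choice of p are mine, and it seeds the average with the first sample, as the Spin program below does):

```python
# Minimal exponential (IIR) average: avg = avg * p + x * (1 - p).
# The closer p is to 1, the heavier the smoothing.
def exp_smooth(data, p=0.9):
    out = []
    avg = data[0]                     # seed the average with the first sample
    for x in data:
        avg = avg * p + x * (1 - p)
        out.append(avg)
    return out

# A constant input passes through unchanged (unity DC gain):
print(exp_smooth([100, 100, 100, 100], p=0.5))   # -> [100.0, 100.0, 100.0, 100.0]
```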

Another type of filter is the boxcar smoother:

avg = (x[i-n+1] + x[i-n+2] + ... + x[i-1] + x[i]) / n

This is a form of a finite impulse response (FIR) filter, in that the influence of any particular datum lasts only n samples in the data stream. It's not a very nice filter, because it has some nasty bumps at the high end of its frequency response.

A binomial filter is like a boxcar smoother, except that the past values are not all weighted the same. In fact, they're weighted by the numbers in one of the odd-length rows of Pascal's triangle and divided by their sum, which is always a power of two:

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1

The further down the triangle you go, the larger the coefficients become, to the point where overflow is a real problem when they're used to multiply the data points. So what I've done is to apply just the [1, 2, 1] filter to the data iteratively, n times, to reach the nth odd-length row of the triangle. And this involves just adding and shifting at each step.

To do this, the program uses a history array 3*n longs long. Each successive triad contains the history of the last three data points, where the data in triad i are the results of applying the [1, 2, 1] filter to triad i-1. I have sat down with pencil and paper to prove that doing this twice (n = 2) is equivalent to applying the [1, 4, 6, 4, 1] filter. So by induction, I'm assured that successive odd-length rows fall out from further iterations. The downside to doing it this way is the increased likelihood of truncation errors. But that's the price to be paid for no overflows.
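The pencil-and-paper result is easy to confirm by convolution; a quick check in Python:

```python
import numpy as np

# Convolving the [1, 2, 1] kernel with itself walks down the odd-length
# rows of Pascal's triangle, and each row's sum is a power of two.
coeffs = np.array([1, 2, 1])
coeffs = np.convolve(coeffs, [1, 2, 1])   # two passes
print(coeffs.tolist())                    # -> [1, 4, 6, 4, 1]
coeffs = np.convolve(coeffs, [1, 2, 1])   # three passes
print(coeffs.tolist())                    # -> [1, 6, 15, 20, 15, 6, 1]
print(int(coeffs.sum()))                  # -> 64, a power of two
```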

Here's the program:
```
CON

  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

  MAXSTAGES     = 50
  UNDEF         = $8000_0000

VAR

  long history[MAXSTAGES * 3]

OBJ

  pst:  "Parallax Serial Terminal"

PUB start | seed, yy, zz, i, base, avg

  pst.start(9600)
  configure_smoothing(@history)
  repeat i from 0 to 511
    if (i > 128 and i < 384)
      base := 2000                               '+1000 pulse in the middle.
    else
      base := 1000
    yy := base + (?seed >> 25) + (i // 40) * 4   'Apply noise and sawtooth wave to signal.
    zz := smooth(@history, yy, 30)               'Do a 30-stage binomial smoothing.

    if (i)
      avg := (avg * 11 + yy) / 12                'Exponential smoothing for comparison.
    else
      avg := yy

    pst.dec(yy)
    pst.char(" ")
    pst.dec(zz)
    pst.char(" ")
    pst.dec(avg)
    pst.char(13)

'The following two subroutines set up and perform the binomial smoothing operation.

PUB configure_smoothing(hist_addr)

  long[hist_addr] := UNDEF                       'Constant used for undef, since Spin doesn't have one.

PUB smooth(hist_addr, value, n) | i

  n += n << 1                                    'Multiply n by 3.
  if (long[hist_addr] == UNDEF)                  'If first value, fill history with it.
    longfill(hist_addr, value, n)
    return value
  else
    longmove(hist_addr, hist_addr + 4, n - 1)    'Otherwise shift history left by one long.
    repeat i from 0 to n - 1 step 3              'Iterate the [1 2 1] filter.
      long[hist_addr][i + 2] := value            'Insert value as last in local history triad.
      value := (long[hist_addr][i] + (long[hist_addr][i + 1] << 1) + value + 3) >> 2 'value := (value[i-2] + 2*value[i-1] + value[i]) / 4
  return value
```
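For anyone who wants to play with the algorithm off-chip, here's a rough Python port of the smooth routine above (the function and variable names are mine; None stands in for UNDEF):

```python
# Rough Python port of the Spin smooth() routine, using the same 3*n-long
# triad history and the same integer arithmetic.
def make_history(n):
    return [None] * (3 * n)           # None plays the role of UNDEF

def smooth(hist, value):
    if hist[0] is None:
        hist[:] = [value] * len(hist)      # first sample fills the history
        return value
    del hist[0]                            # shift history left by one slot
    hist.append(0)                         # last slot is overwritten below
    for i in range(0, len(hist), 3):
        hist[i + 2] = value                # insert value as last in this triad
        value = (hist[i] + 2 * hist[i + 1] + value + 3) >> 2
    return value

# Sanity check: a constant input should pass through unchanged.
h = make_history(5)
results = [smooth(h, 100) for _ in range(20)]
print(results[-1])    # -> 100
```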

This uses a 30th-order binomial filter which, if done explicitly, would involve 61 coefficients. I plotted the data it output in Excel. Here's the graph:

The major takeaways are these:

1. Both the exponential and binomial filters do a good job of cleaning up short-term fluctuations in the data.
2. The binomial filter does a much better job following data trends that the exponential filter mushes up.
3. The binomial filter entails a delay equal to n. The reason for this is that the highest coefficient is in the middle, n data samples in the past from the current one. Another way to look at it is that the binomial filter considers not only the current value and values in the past, but also values in the future, to compute the smoothing.
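Point 3 can be seen directly by pushing an impulse through the iterated filter; a small Python sketch (the numbers are from this sketch, not from the Spin program):

```python
import numpy as np

# Push a unit impulse through n causal passes of the [1, 2, 1]/4 kernel;
# the peak of the resulting impulse response lands n samples late.
n = 30
y = np.zeros(101)
y[0] = 1.0
for _ in range(n):
    y = np.convolve(y, [0.25, 0.5, 0.25])[:len(y)]   # one causal pass
print(int(np.argmax(y)))   # -> 30, the filter's delay in samples
```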

-Phil
Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away. -Antoine de Saint-Exupery

• Posts: 1,700
Interesting - always on the lookout for a digital filter without multiplies for PASM use (!)

It's a bit less computational load to use [1 1] as the basis rather than [1 2 1]: twice as many steps, but only 2 x (move + add) rather than 3 x move + shift + add + add.

The moves can be eliminated by ping-ponging between two copies; some Python to illustrate (this ought to translate to two unrolled PASM loops, one for each value of sense, meaning the cost is mainly one add instruction per [1 1] stage):
```
import numpy as np

N = 16
state = np.zeros([2, N], float)

sense = 0

def smooth(inp):
    global sense
    state[sense, 0] = inp
    for i in range(N - 1):
        state[sense, i + 1] = state[0, i] + state[1, i]   # one [1 1] stage
    out = state[0, N - 1] + state[1, N - 1]
    sense = 1 - sense    # swap columns
    return out / (1 << N)
```
• Posts: 1,190
Hello Phil, what do you want to achieve? Remove the noise and just keep the sawtooth? Determine the frequency of the sawtooth? Do you need the result in real time, or is a time shift acceptable? Could you attach the raw data file?
• Posts: 22,053
edited October 19
My objective is just to remove the noise while preserving the short-term and long-term trends without smooshing them into oblivion. The filter I've presented meets that criterion for my app.

Here's the raw data generated by the above program:
```
1104
1044
1100
1020
1118
1054
1105
1132
1096
1152
1124
1050
1156
1099
1113
1124
1092
1192
1122
1112
1141
1131
1209
1144
1168
1118
1198
1119
1212
1181
1221
1196
1250
1226
1235
1176
1229
1259
1197
1246
1012
1013
1015
1017
1118
1062
1111
1032
1143
1105
1122
1046
1083
1164
1147
1161
1160
1074
1156
1178
1138
1134
1140
1164
1114
1218
1167
1176
1139
1133
1244
1214
1228
1165
1187
1217
1260
1208
1262
1200
1091
1105
1048
1064
1120
1092
1034
1129
1040
1042
1100
1082
1094
1108
1137
1164
1128
1096
1100
1096
1098
1106
1201
1174
1098
1199
1204
1149
1237
1143
1169
1162
1254
1181
1262
1253
1178
1181
1243
1233
1028
1040
1109
1100
1023
1049
1125
1060
1114
2038
2147
2110
2167
2138
2125
2086
2184
2121
2073
2181
2121
2145
2111
2116
2115
2131
2222
2171
2180
2135
2231
2169
2218
2136
2231
2238
2189
2238
2276
2206
2076
2129
2035
2133
2044
2080
2062
2090
2133
2132
2126
2145
2064
2097
2162
2103
2126
2137
2170
2145
2178
2169
2159
2109
2140
2199
2212
2147
2183
2221
2145
2246
2182
2167
2224
2243
2243
2216
2195
2250
2067
2130
2081
2127
2121
2061
2053
2049
2155
2091
2042
2143
2140
2148
2078
2181
2164
2133
2093
2191
2081
2125
2101
2100
2142
2188
2223
2138
2191
2195
2247
2140
2157
2241
2214
2178
2182
2170
2265
2214
2018
2058
2051
2050
2094
2098
2054
2075
2081
2050
2146
2127
2059
2168
2148
2156
2142
2194
2161
2090
2178
2161
2148
2162
2120
2119
2167
2152
2195
2223
2234
2180
2193
2249
2229
2149
2191
2261
2258
2223
2094
2071
2118
2088
2125
2074
2067
2122
2043
2136
2049
2147
2059
2088
2149
2157
2167
2175
2122
2120
2091
2168
2174
2097
2166
2180
2121
2128
2218
2143
2145
2142
2158
2219
2236
2213
2267
2187
2271
2242
2109
2050
2064
2053
2037
2135
2129
2093
2085
2069
2107
2162
2127
2139
2060
2067
2173
2146
2190
2195
2198
2155
2201
2110
2182
2193
2137
2175
2182
2156
2236
2184
2238
2208
2245
2154
2202
2270
2246
2191
2024
2071
2134
2117
2129
2046
2048
2111
2067
2044
2046
2128
2070
2173
2076
2150
2172
2091
2152
2093
2196
2128
2163
2108
1213
1185
1111
1217
1118
1184
1147
1141
1212
1210
1262
1221
1224
1237
1262
1200
1059
1039
1088
1109
1039
1052
1138
1076
1127
1158
1166
1133
1150
1086
1097
1153
1097
1191
1191
1138
1093
1084
1120
1182
1148
1156
1121
1224
1204
1172
1161
1209
1223
1182
1220
1242
1170
1172
1251
1232
1061
1131
1112
1076
1092
1065
1058
1085
1064
1086
1140
1149
1073
1118
1151
1094
1129
1169
1088
1145
1098
1202
1207
1186
1131
1124
1179
1148
1204
1188
1242
1202
1230
1254
1206
1164
1243
1152
1247
1222
1111
1129
1083
1028
1085
1110
1068
1095
1070
1106
1088
1171
1096
1123
1121
1137
1108
1119
1141
1190
1088
1090
1108
1166
1153
1132
1202
1169
1199
1216
1177
1188
```

-Phil
• Posts: 1,190
This filter removes noise to a certain level, so the slopes are less steep. There is no phase shift visible, because the original signal is delayed as well; otherwise the phase shift would be 15 samples. Since the middle part is easily detected as signal, the signal is removed before applying the filter and added back in afterwards; that is the reason the step is still sharp.
• Posts: 22,053
How did you do it? In my target app, I don't know what the signal is, so I can't remove it ahead of time. The data are a time-series, and I need to apply the filter in real time.

-Phil
• Posts: 488
ErNa, that is some seriously impressive filtering. Please share at least a hint?

Mike R.
• Posts: 6,378
You know, this smells like a "Golf Challenge" :-)

Here is a 4th order delta filter ...

The superimposed trace at the bottom, with the spike going down on the left and up on the right, is the delta between samples.
The sample starting at about 1000 is the original data.
The succeeding traces represent each successive order of filtering. I purposely offset the Y by 250 on each filter.

The filter consists of a FIFO window of three samples; call them A1, A2, and A3. D1, D2, and D3 are delta values. F1 is the filtered output. Each pass of the filter, you lose the last two data samples.

D1 = A1-A2
D2 = A1-A3
D3 = INT((D1+D2)/2)
F1 = A1-D3

Beau Schwabe -- Submicron Forensic Engineer
www.Kit-Start.com - bschwabe@Kit-Start.com ෴෴ www.BScircuitDesigns.com - icbeau@bscircuitdesigns.com ෴෴

• Posts: 22,053
edited October 20
Beau,

So,

D3 = (D1 + D2) / 2 = (A1 - A2 + A1 - A3) / 2 = A1 - (A2 + A3) / 2
F1 = A1 - D3 = (A2 + A3) / 2
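That algebra can be spot-checked numerically; a small sketch, with floor division standing in for INT():

```python
# Numerical spot-check of the reduction above: the delta filter's output
# equals the average of the two older samples in the window.
def delta_filter(a1, a2, a3):
    d1 = a1 - a2
    d2 = a1 - a3
    d3 = (d1 + d2) // 2     # floor division, as with INT()
    return a1 - d3

print(delta_filter(10, 4, 6))   # -> 5, which is (4 + 6) / 2
```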

This looks like the [1, 1] kernel, but peering into the future, rather than into the past, which explains the increasing negative delay with each pass. Right?

-Phil
• Posts: 6,378
edited October 20
Phil,
Peering into the future is correct, I guess, although technically you are still looking at the past, since the last two data points are lost (or rather, become part of the equation) with each delta order you implement. The idea is that you need at least three data points in a first-in, first-out approach, so the output for each order isn't valid until you have read at least three points; either way you lose the last two or the first two, depending on how you look at the incoming data stream. For the 4th-order delta you end up losing 8 data points.

Here is the same data taken to an 8th-order delta filter. Sixteen data points are lost at the end, so the 512-point data set is now 496.

EDIT: Losing two data points per order probably isn't correct; a phase shift or delay would be more accurate. So each time you implement a filter stage, you introduce a delay of two samples.

Beau Schwabe -- Submicron Forensic Engineer

• Posts: 22,053
Unfortunately, I don't have the luxury of knowing what future data values will be, and I need to come up with a filtered value based only upon past data values. However, I might be able to predict future data values based upon a regression analysis of past values and use those as if they were actual data. Hmmm ...

-Phil
• Posts: 6,378
edited October 20
Argggghh !!!! .... It's all a matter of perspective. The Delta filter doesn't know any future values. I think what is throwing you off is that the data in the graph is left justified.

Beau Schwabe -- Submicron Forensic Engineer

• Posts: 1,190
Seems to be an interesting matter, and Beau has come up with a very good solution; he's deep into the matter. ;-)
When it comes to filtering the data, the question always is: what do you expect to happen? This sample data seems artificial, as there is a pulse, a sawtooth, and noise. If you look at the derivative Beau created, it is obvious there are spikes at a certain spacing and a certain amplitude, so these can be predicted too and seen as part of the signal. It is a good idea not to filter those spikes out. So one could say: calculate the derivative, limit the amplitude to a 3-sigma level, apply a filter, and remove the filtered-out components from the original signal; that should suppress the noise and keep the information.

In this signal, the question is: what is the information? Is it every new sample, or is the information only valid after a sawtooth completes? In the latter case, the value is the average over one sawtooth and you have a time delay of half a period anyway. In the other case, the history tells you there is a systematic sawtooth, and a new sample just tells you whether this sawtooth continues or ends. Such deviations can only be found when the noise is less than the change in the signal.
• Posts: 1,190
Whenever we sample a signal, we either accept what we get, or we look back to the past to see whether the signal changed faster than we expected. Being more optimistic, we can also make a projection into the future and check whether the signal fits our prediction. Then we have to decide what the "truth" is. A binomial filter is, to my understanding, a convolution of the signal with a Gaussian; so it can never reproduce a step without generating a finite slope, and therefore a phase delay.
• Posts: 22,053
edited October 20
I just don't see the effect of any derivatives in Beau's filter. It's just:

F1 = (A2 + A3) / 2
F2 = (A3 + A4) / 2, etc.

And that's the same as Mark_T's technique for iterating the binomial filter, at least after any even number of iterations. (If you stop at an odd number, you don't get a true binomial filter.)
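The equivalence (and the odd-pass caveat) is easy to check by convolving kernels; a quick sketch:

```python
import numpy as np

# Two passes of [1, 1] equal one pass of [1, 2, 1]; a third pass yields
# the even-length kernel [1, 3, 3, 1], centered between samples, which is
# why an odd number of passes isn't a true (odd-length) binomial filter.
twice = np.convolve([1, 1], [1, 1])
print(twice.tolist())     # -> [1, 2, 1]
thrice = np.convolve(twice, [1, 1])
print(thrice.tolist())    # -> [1, 3, 3, 1]
```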

Ideally, the filter I use would look something like this:

f[i] = f(a0*x[i] + a1*x[i-1] + ... + an*x[i-n])

The binomial filter fits the bill. But, because the middle a terms are the largest ones, they contribute to a large phase delay. What I propose is to come up with a binomial filter whose left tail is applied to past values, the center term to the current value, and the right tail to "future" values extrapolated from past values, using a polynomial or other regression technique. That way, the phase delay is minimized.

-Phil
• Posts: 1,190
Phil, the problem you show is quite simple to solve: just assert what the next value is to be, and if you do this for a long time, the signal will follow, since the signal is as dumb as bread. But since we are older and know about the conservation of momentum and energy, we do not follow this advice of mine. We should argue more reasonably:
What we actually do when applying a filter is write one vector of length 1, with its n components being the filter coefficients; then we take the last n samples of the signal as a second vector, and we calculate the scalar product to get "how much" of the filter vector is "in" the signal.
Imagine the signal is constant but noisy. Then the scalar product up to the middle of the binomial is just 50%. This will eliminate the noise and not introduce a phase shift; as the signal is constant, phase shift doesn't apply. This will also work for any filter coefficients. The binomial just weights the values close to the center a little more, so if the signal is not constant, your result will be closer to the actual value of the last sample. If you could look into the future and see that the signal is symmetrical to the past, then the next 50% of the signal would show the same value as the first 50%. But as we do not know how the signal develops, this is just a lottery.
So you could do this experiment: take 50% of the binomial and calculate the current value. Next, take 70% of the binomial, so the peak is in the past, and calculate the current signal. Now imagine going back in time to the peak of the 70%, predict what will happen 20% later (the now-known current value), and determine whether your prediction was right. Then, as you know the error you made, assume that you have just made the same error again, and correct it.
That is just the basic principle of how to make a learning system. But always check that your predictions don't violate conservation of momentum and energy. Or, to give a less abstract example: if your employment rate rises constantly, there is a natural limit of 100%. If someone tells you that growth will continue forever, he might be wrong.
• Posts: 1,700
Phil wrote:
Unfortunately, I don't have the luxury of knowing what future data values will be, and I need to come up with a filtered value based only upon past data values. However, I might be able to predict future data values based upon a regression analysis of past values and use those as if they were actual data. Hmmm ...

-Phil

Sounds like a recipe for a nonlinear chaotic system to me... And you lose all the predictable properties of a time-invariant linear system.
• Posts: 1,700
Or, put another way: try an IIR filter first; both approaches are going to require lots of multiply-and-add, I think.
• Posts: 1,190
What is bad about guessing the future, if we already guess what is happening now and what happened in the past? There are so many histories that it takes scientists to establish a narrative. ;-)
In the future we will have the P2, that is for sure; in the past we had no P2, that's for sure too. And in between, we wait.....
• Posts: 22,053
Mark_T wrote:
Or, put another way: try an IIR filter first; both approaches are going to require lots of multiply-and-add, I think.

No!!! That's precisely what I'm trying to get away from. I'm totally committed to the FIR approach. I just need something without a huge phase lag that I can explain to my customer.

-Phil
• Posts: 1,190
edited October 21
Phil, the point with your signal is that the high-frequency components that produce the sharp edges also generate the noise. Removing the noise flattens the signal, and so there is a time shift. Search for a sawtooth and shift it in both amplitude and time to find the optimal correlation. I suppose the signal is not high-frequency, so computational effort should not be an issue.
• Posts: 1,190
edited October 22
Calculating the derivative, but with D(i) = S(i) - S(i-2).
The idea behind that is: obviously, the steps in the signal (red) extend over more than one sample, so the highest-frequency (alternating) noise is suppressed while the signal steps are not. By differentiation the constant part is lost, so it has to be added back separately when reintegrating.
Integration now takes every difference twice, so the result has to be divided by 2 (green). To improve noise suppression, the derivative can also be averaged (in this case (d(i) + d(i-1) + d(i-2)) / 3), which shows the result in red.

It looks like the key to filtering is knowing what you are looking for. So if you look at your neighbor and see a mob, take care that your view was not filtered by a mirror.

• Posts: 1,700
Mark_T wrote:
Or, put another way: try an IIR filter first; both approaches are going to require lots of multiply-and-add, I think.

No!!! That's precisely what I'm trying to get away from. I'm totally committed to the FIR approach. I just need something without a huge phase lag that I can explain to my customer.

-Phil

Then you have to put up with a small number of coefficients in the FIR. Have you tried approximations to sinc() using small integer coefficients (mainly powers of two)? For instance:

[1, 2, 1, 0, -2, -4, -2, 0, 4, 12, 16, 18, 16, 12, 4, 0, -2, -4, -2, 1, 2, 1]

This has a cutoff of about fs/8 and a latency of 10 samples, derived by ad hoc means. I don't know how to determine FIR coefficients for minimum phase, other than to simply truncate the future coefficients sooner.
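One way to sanity-check an integer tap set like this is to normalize by the coefficient sum and evaluate the magnitude response numerically; a sketch (the frequency grid and normalization are my additions; 74 is just the sum of the taps above):

```python
import numpy as np

# Evaluate |H(e^jw)| for the integer-coefficient FIR above, normalized
# by the coefficient sum (74) so the DC gain is exactly 1.
h = np.array([1, 2, 1, 0, -2, -4, -2, 0, 4, 12, 16, 18, 16, 12, 4,
              0, -2, -4, -2, 1, 2, 1], float)
h /= h.sum()                                       # unity gain at DC
w = np.linspace(0, np.pi, 512)                     # 0 .. fs/2
H = np.abs(np.exp(-1j * np.outer(w, np.arange(len(h)))) @ h)
print(round(H[0], 6))                              # -> 1.0 (DC gain)
```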
• Posts: 1,700
I recently came across a description of cascaded integrator-comb (CIC) filters, and it seems pertinent to this thread.

Outside of their use for decimation and interpolation, an integrator-comb filter is another way of looking at boxcar filtering: the combination of an integrator and a comb filter, in which the integrator's pole cancels a single zero of the comb filter.
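That cancellation is easy to demonstrate numerically; a sketch (R = 4 and the ramp input are arbitrary choices for illustration):

```python
import numpy as np

# An integrator followed by a comb of length R equals an R-sample moving
# sum: the integrator's pole cancels one zero of the comb.
R = 4
x = np.arange(20.0)                            # arbitrary test input
integ = np.cumsum(x)                           # integrator: running sum
comb = integ.copy()
comb[R:] -= integ[:-R]                         # comb: y[n] = integ[n] - integ[n-R]
boxcar = np.convolve(x, np.ones(R))[:len(x)]   # R-sample moving sum
print(bool(np.allclose(comb, boxcar)))         # -> True
```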
• Posts: 6,284
edited November 4
Here's a thread where Marty Lawsen applied a CIC filter to improve the resolution of the prop1 Sigma Delta:

And here's a pdf link to that nice article by Richard Lyons:
https://faculty-web.msoe.edu/prust/sdr/cic_filters.pdf
• Posts: 22,053
edited November 5
Tracy, thanks for posting the link! That's a very good paper, although my eyes start to glaze over when it talks about the Z-plane.

One thing I've discovered the hard way is that so much in the filtering world is applicable to AC signals (e.g. audio, RF) only, neglecting the DC component. I thought maybe I could use my FIR filter generator to come up with a filter for the time-series data. But no. Even though the TFilter FIR filter design tool claims a gain of 1.0 for a low-pass filter, the gain at 0 Hz can be less than that, which means the filtered data will be skewed downwards. This is because the ripple in the passband may not intersect the Y-axis at 0 dB.

The simplest way to guarantee DC unity gain is to make sure that all of the FIR filter coefficients sum to 1.0. If they are all positive, so much the better. This is not the case with many TFilter-generated filters, even though the AC gain is still 1.0.
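Enforcing that rule in code is trivial; a minimal sketch (the tap values are made-up illustration numbers, not output from TFilter):

```python
# The gain of an FIR filter at 0 Hz is the sum of its coefficients, so
# renormalizing the taps to sum to 1 guarantees unity DC gain.
def dc_gain(coeffs):
    return sum(coeffs)

taps = [0.1, 0.2, 0.35, 0.2, 0.1]          # example taps; sum is 0.95, not 1
scale = 1.0 / dc_gain(taps)
taps = [c * scale for c in taps]           # renormalize for unity DC gain
print(abs(dc_gain(taps) - 1.0) < 1e-12)    # -> True
```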

Currently, I'm looking at a combination of a causal FIR filter (i.e. one with minimum phase delay), followed by median filtering.

-Phil
• Posts: 1,190
Hi Phil, I feel for you; the z-plane to me is also not very intuitive. Mainly it's because there is a theory that reflects models, and those models don't always reflect reality. But if you see DC as a very low frequency, it becomes clear that any filter to remove those low frequencies will have a very long response time. DC is only detectable if you know that any error is due to overlaid AC, so you can isolate only AC that is composed of many harmonics. That's why I asked what signal you expect. From the example you gave, a step function and a sawtooth seem to be the components of interest.
• Posts: 22,053
edited November 5
The sawtooth and step functions were just highly adverse fake data to test various filtering strategies. The real data come from environmental sensors and are noisy, but look nothing like the functions I posted.

-Phil
• Posts: 1,190
That is what I imagined. The point with filtering is that you always have to know what the signal looks like and how the transmission is structured.
The simplest assumption is: the noise is stochastic. In this case, the filter of choice is the Gaussian-scaled sliding average. If the noise is created by accident, then it shouldn't make any difference which sampling points you take. But as there is an underlying signal, selecting the sampling points matters. So either way, you have to have a model of the signal-generating mechanism.
• Posts: 6,284
I know what you mean about the z-domain. My son is taking a second degree, in electrical engineering (first one in music), and he is now in the throes of control systems and z transforms. I had a lot of that when I was in school but it is all very rusty--I can't give him much support. I remember there was a good ol' crank to turn for systems that involved time delays and feedback taps. The CIC-like and shaping filters are all about that.

Chopping the signal works well, to bring the signal from DC up to a set higher frequency where it can be dealt with in an exquisite narrowband filter. Unfortunately a lot of those environmental signals can't be chopped in a way that doesn't include the interference.
• Posts: 6,284
Why would you use the FIR filter followed by the median filter? I'd think you'd want it the other way around: the median filter tends to eliminate outliers.