I'm working on a Propeller app that involves the smoothing of time-series data. One method that's often used for this is the very simplistic exponential averaging:

  avg[i] = avg[i-1] * p + data[i] * (1 - p), where 0 < p < 1

The closer p is to 1, the more smoothing you get. In physical terms, this is equivalent to a simple RC filter and is a type of infinite impulse response (IIR) filter. This means that, theoretically, a datum an infinite time in the past still affects the current average. Of course, due to rounding and truncation in computer math, the effect is finite, but it can be quite long.
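As a concrete sketch of that recursion, here's a minimal Python version (the function name and the choice of seed value are mine):

```python
# Exponential (IIR) smoothing: avg[i] = avg[i-1] * p + data[i] * (1 - p).
def exp_smooth(data, p):
    avg = data[0]                      # seed with the first sample
    out = []
    for x in data:
        avg = avg * p + x * (1 - p)
        out.append(avg)
    return out
```

Seeding with the first sample avoids a long startup transient from an arbitrary initial average.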
Another type of filter is the boxcar smoother:

  avg = (x[i-n+1] + x[i-n+2] + ... + x[i-1] + x[i]) / n

This is a form of a finite impulse response (FIR) filter, in that the influence of any particular datum lasts only n samples in the data stream. It's not a very nice filter, because it has some nasty bumps at the high end of its frequency response.
A binomial filter is like a boxcar smoother, except that the past values are not all weighted the same. In fact, they're weighted by the numbers in one of the odd lines in Pascal's triangle, and divided by the sum, which is always a power of two:

            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1
The further down the triangle you go, the larger the coefficients become, to the point where overflow is a real problem when they're used to multiply the data points. So what I've done is to apply just the [1, 2, 1] filter to the data iteratively n times to get to the nth odd row of the triangle. And this just involves adding and shifting at each step.
To do this, the program uses a history array 3*n longs long. Each successive triad contains the history of the last three data points, where the data in triad i are the results of applying the [1, 2, 1] filter to triad i-1. I have sat down with pencil and paper to prove that doing this twice (n = 2) is equivalent to applying the [1, 4, 6, 4, 1] filter. So by induction, I'm assured that successive odd rows fall out from further iterations. The downside to doing it this way is the increased likelihood of truncation errors. But that's the price to be paid for no overflows.
Here's the program:
CON

  _clkmode  = xtal1 + pll16x
  _xinfreq  = 5_000_000

  MAXSTAGES = 50
  UNDEF     = $8000_0000

VAR

  long history[MAXSTAGES * 3]

OBJ

  pst : "Parallax Serial Terminal"
PUB start | seed, yy, zz, i, base, avg

  pst.start(115_200)                                   'Start the serial terminal (baud rate assumed).
  reset(@history)                                      'Mark the history as uninitialized.
  repeat i from 0 to 511
    if (i > 128 and i < 384)
      base := 2000                                     '+1000 pulse in the middle.
    else
      base := 1000
    yy := base + (?seed >> 25) + (i // 40) * 4         'Apply noise and sawtooth wave to signal.
    zz := smooth(@history, yy, 30)                     'Do a 30-stage binomial smoothing.
    if (i)
      avg := (avg * 11 + yy) / 12                      'Exponential smoothing for comparison.
    else
      avg := yy                                        'Seed the exponential average with the first sample.
    pst.dec(yy)                                        'Output raw and smoothed values as CSV for plotting.
    pst.char(",")
    pst.dec(zz)
    pst.char(",")
    pst.dec(avg)
    pst.newline
'The following two subroutines set up and perform the binomial smoothing operation.

PUB reset(hist_addr)

  long[hist_addr] := UNDEF                             'Constant used for undef, since Spin doesn't have one.
PUB smooth(hist_addr, value, n) | i

  n += n << 1                                          'Multiply n by 3.
  if (long[hist_addr] == UNDEF)                        'If first value, fill history with it.
    longfill(hist_addr, value, n)
  else
    longmove(hist_addr, hist_addr + 4, n - 1)          'Otherwise, shift history left by one long.
  repeat i from 0 to n - 1 step 3                      'Iterate the [1 2 1] filter.
    long[hist_addr][i + 2] := value                    'Insert value as last in local history triad.
    value := (long[hist_addr][i] + (long[hist_addr][i + 1] << 1) + value + 3) >> 2 '(v[i-2] + 2*v[i-1] + v[i]) / 4, rounded.
  return value                                         'Return the fully-smoothed value.
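As a cross-check on the triad scheme, here's a rough Python transliteration of smooth() (illustrative only; None stands in for the UNDEF sentinel, and the +3 rounding matches the Spin code):

```python
UNDEF = None  # stand-in for the Spin UNDEF sentinel

# history is a list of 3*n entries; history[0] is UNDEF before the first call.
def smooth(history, value, n):
    if history[0] is UNDEF:
        history[:] = [value] * (3 * n)     # fill history with the first value
    else:
        history[:-1] = history[1:]         # shift history left by one entry
    for i in range(0, 3 * n, 3):
        history[i + 2] = value             # value becomes last in the triad
        value = (history[i] + (history[i + 1] << 1) + value + 3) >> 2
    return value
```

A constant input should pass through unchanged, which makes a handy sanity test for any port of the routine.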
This uses a 30th-order binomial filter which, if done explicitly, would involve 61 coefficients. I plotted the data it output in Excel. Here's the graph:
The major takeaways are these:
1. Both the exponential and binomial filters do a good job of cleaning up short-term fluctuations in the data.
2. The binomial filter does a much better job following data trends that the exponential filter mushes up.
3. The binomial filter entails a delay equal to n. The reason for this is that the highest coefficient is in the middle, n data samples in the past from the current one. Another way to look at it is that the binomial filter considers not only the current value and values in the past, but also values in the future, to compute the smoothing.
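That delay falls straight out of the coefficients: iterating [1, 2, 1] n times yields a kernel of 2n + 1 taps whose peak sits n samples back. A quick Python check (illustrative only; the function name is mine):

```python
# Build the effective kernel of n passes of the [1, 2, 1] filter.
def iterated_kernel(n):
    coeffs = [1]
    for _ in range(n):
        nxt = [0] * (len(coeffs) + 2)
        for i, c in enumerate(coeffs):
            nxt[i] += c                    # weight-1 tap
            nxt[i + 1] += 2 * c            # weight-2 tap
            nxt[i + 2] += c                # weight-1 tap
        coeffs = nxt
    return coeffs

kernel = iterated_kernel(30)               # the 30-stage filter from the post
peak = kernel.index(max(kernel))           # largest weight: 30 samples back
```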