In principle the z-transform sounds quite simple: take the textbook, look up the transform, multiply, transform back, ...
But the question is still: how to understand what we are actually doing?
If you have arbitrary noise and no knowledge about the character of the signal, nothing can be done.
But if you know that your signal has certain characteristics, you can search for those, if it is not the characteristic of arbitrary noise.
So if your signal is sunlight and you wish to determine the duration of a day, integrate the amount of light (it will start from zero, show a slope and finally remain constant), search for the maximum of the derivative, and then determine the time between two maxima. Or: autocorrelate two integrals. Whatever you do: if a signal comes only once, forget about it, as you will never gain any advantage from that knowledge if it doesn't come again. As we in Europe say: it took a second world war to teach us that such an event doesn't come by accident, but is born from stupidity. If something happens only once, who cares... it will never come again.
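For what it's worth, the autocorrelation route can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the post above: a hypothetical hourly light reading `light`, a made-up helper `autocorr_period`, and autocorrelation of the raw readings rather than of their integrals.

```python
import math

def autocorr_period(samples, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        pairs = len(centered) - lag
        # Average product of the signal with a copy of itself shifted by `lag`.
        score = sum(centered[i] * centered[i + lag] for i in range(pairs)) / pairs
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical light sensor: one reading per hour for ten days, dark at night.
light = [max(0.0, math.sin(2 * math.pi * t / 24)) for t in range(240)]
period_hours = autocorr_period(light, min_lag=12, max_lag=36)
print(period_hours)
```

The search range (12 to 36 hours) is itself prior knowledge about the signal, which is the point being made: without some expectation of what repeats, there is nothing to search for.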

Why would you use the FIR filter followed by the median filter? I'd think you want the other way around.

That's a good point. I could apply a binomial filter around the centroid terms of the sorted array used to compute the median. It's not really a FIR filter in that case, since it's not being applied in the time domain.

If I apply a FIR filter to the median-filtered results, I'd be averaging the same values multiple times, if the median doesn't change for several steps. But maybe that's not the downside I had assumed it could be.

One thing that confuses me about the CIC filter is this:

In the "recursive running sum filter," the difference equation is:

y(n) = [x(n) - x(n - D)] / D + y(n - 1)

That's just a point-slope formula, wherein the slope off of y(n-1) is estimated by two prior points in the time series. That makes perfect sense.

But the CIC filter eliminates the 1/D factor, which would seem to entail a much larger estimation of the slope. I know I'm missing something here; I just don't know what it is.
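One way to look at the recursive form, sketched in Python with made-up sample values and hypothetical function names: the term [x(n) - x(n - D)] / D is just the change in the D-point average when the window slides by one sample, so the recursion reproduces the plain moving average. Both versions below assume x and y are zero before n = 0.

```python
def moving_average_direct(x, D):
    """D-point average recomputed from scratch at every step."""
    padded = [0.0] * (D - 1) + list(x)
    return [sum(padded[n:n + D]) / D for n in range(len(x))]

def moving_average_recursive(x, D):
    """y(n) = [x(n) - x(n - D)] / D + y(n - 1)."""
    y, prev = [], 0.0
    for n, xn in enumerate(x):
        oldest = x[n - D] if n >= D else 0.0   # sample leaving the window
        prev = (xn - oldest) / D + prev
        y.append(prev)
    return y

x = [4, 8, 6, 2, 10, 12]
direct = moving_average_direct(x, D=4)
recursive = moving_average_recursive(x, D=4)
print(direct)
print(recursive)
```

Dropping the 1/D factor then only changes the output from the window average to the window sum, i.e. the same curve scaled by D.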

And since median filters are non-linear it matters which order you do things in, unlike LTI operators (quantization issues aside).

Though once you have non-linear operations you have to put up with noise being able to mix into the wanted signal irreparably, or two signal components mixing into each other, so care is needed.
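A small Python sketch of why the order matters, using a 3-sample running median and a 3-sample running mean as a stand-in for the FIR filter. The helper names, the test signal and the choice to repeat the first sample at the start are all assumptions made for illustration.

```python
from statistics import median

def running_median3(x):
    """Median of the current and two previous samples."""
    padded = [x[0], x[0]] + list(x)
    return [median(padded[n:n + 3]) for n in range(len(x))]

def running_mean3(x):
    """Mean of the current and two previous samples (a 3-tap FIR)."""
    padded = [x[0], x[0]] + list(x)
    return [sum(padded[n:n + 3]) / 3 for n in range(len(x))]

signal = [0, 0, 0, 9, 0, 0, 0]   # flat signal with a single spike

median_then_mean = running_mean3(running_median3(signal))
mean_then_median = running_median3(running_mean3(signal))
print(median_then_mean)
print(mean_then_median)
```

With the median first, the spike is rejected before it can be averaged. With the mean first, the spike is smeared over three samples, and the median can no longer tell it apart from a genuine step.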

In signal processing, a nonlinear (or non-linear) filter is a filter whose output is not a linear function of its input. ...
An important example of the latter is the running-median filter, such that every output sample R_i is the median of the last three input samples r_i, r_{i−1}, r_{i−2}.

ok, I have to admit, I didn't get the difference between "mean" and "median". While I know mean from "don't be mean to me", I mostly know median from comedian. ;-) And as I just found out: linear filters are preferred because we know how to design them, while non-linear filters fit real problems best. It's like: a lie is not a lie if I like to follow the message.

If you have arbitrary noise and no knowledge about the character of the signal, nothing can be done.
But if you know that your signal has certain characteristics, you can search for those, if it is not the characteristic of arbitrary noise.

Signals are routinely extracted from arbitrary noise; signals have structure, noise does not.

Noise doesn't contain information aside from "it's noise". You can characterize noise as white noise, shot noise, grey noise, etc. But if you say "it's noise", it doesn't contain more information. The color of noise gives you information about the source, but nothing more. If the noise changes color, you gain information about the source of the noise, which again could be a signal, as seen in the background radiation of the universe. Without a model of the big bang, the fluctuations in the background radiation are just noise. If you hear noise and your ear is trained, you can hear Morse code. If you are trained, you can hear every badly tuned violin. So Phil asked for a filter, but doesn't tell us what the signal is, except that it comes from the environment.

The variations in the microwave background are definitely signal, whether or not you have a model to compare it to; they form an image. It's a signal encoded in the amplitude of noise from different directions, but most optical images are that if you think about it.

I believe you are wrong. The variations are just variations, even if something exists that varies. The image is only formed in our brain. As the word "image" tells us: something imagined. But IF you have a model, the fluctuations of the microwave background make sense: the model is that space expands, so the energy has more space to occupy and the temperature comes down. Whatever scenario you can imagine, only the expanding-universe model reconciles the data with the laws of nature we all believe in.

One thing that confuses me about the CIC filter is this:
In the "recursive running sum filter," the difference equation is:
y(n) = [x(n) - x(n - D)] / D + y(n - 1)
That's just a point-slope formula, wherein the slope off of y(n-1) is estimated by two prior points in the time series. That makes perfect sense.
But the CIC filter eliminates the 1/D factor, which would seem to entail a much larger estimation of the slope. I know I'm missing something here; I just don't know what it is.
-Phil

That's confusing to me, too. Even this,
y(n) = x'(n) + y(n - 1)
where x'(n) = x(n) - x(n - D). That is the input to the integrator stage. (Lyons' figure B is mislabeled with y(n) shown before the integrator, where it should be after.)
That recursion is a first order exponential filter. It will increase without bound unless the input has a mean of zero. How to achieve that? AC couple the input signal through a capacitor. The usual bounded form for the first order exponential filter would be y(n) = [x'(n) + y(n-1)]/2, which has gain = 1, and with a divide by 2, you wouldn't need the overall mean=0.

D is simply a scale factor for the D-point average. In integer calculations it is best to scale up anyway, which helps to avoid numerical roundoff errors.
It seems to me the time domain equations in order to keep in bounds should be
y(n) = [x(n) - x(n - D) + y(n - 1)] / 2
Which would yield a bounded result even without the input capacitor, and the result would be scaled up by factor D.

Does that make any sense? I have Lyons' book, but he makes it seem like a non-issue in terms of the z-domain transfer function. The z-transform is the discrete-time version of the Laplace transform. I had a couple of classes back in college that used the z-transform for studying population dynamics, which is to say, babies are the feedback, time-delayed into the population. (Insect populations!) Building and turning the crank on the models seemed a lot simpler at the time than it does reading the textbook now. I built a population model on a PDP-7 computer, and it really got interesting when non-linear terms were introduced in the form of saturation and limited food supply. The big thing at the time was chaotic dynamics from simple systems, intrinsic noise.
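On the boundedness question, a quick numerical check may help. The sketch below (Python, hypothetical function name, zero initial state assumed) feeds a constant, non-zero-mean input through the comb x(n) - x(n - D) followed by the unscaled integrator y(n) = x'(n) + y(n - 1).

```python
def cic_first_order(x, D):
    """Comb x(n) - x(n - D) followed by the integrator y(n) = x'(n) + y(n - 1)."""
    y, acc = [], 0
    for n, xn in enumerate(x):
        comb = xn - (x[n - D] if n >= D else 0)   # x'(n)
        acc += comb                               # integrator, no scaling
        y.append(acc)
    return y

dc_input = [5] * 12            # constant input, mean clearly not zero
out = cic_first_order(dc_input, D=4)
print(out)
```

In this run the output climbs to D times the input level and then stays there. The comb output is zero once the window is full of equal samples, so the integrator only ever holds the sum of the last D inputs. Under these assumptions no AC coupling or divide-by-2 seems to be needed to keep it bounded; the result is simply the D-point average scaled up by D.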

The variations in the microwave background are definitely signal, whether or not you have a model to compare it to; they form an image. It's a signal encoded in the amplitude of noise from different directions, but most optical images are that if you think about it.

I believe you are wrong. The variations are just variations, even if something exists that varies. The image is only formed in our brain. As the word "image" tells us: something imagined. But IF you have a model, the fluctuations of the microwave background make sense: the model is that space expands, so the energy has more space to occupy and the temperature comes down. Whatever scenario you can imagine, only the expanding-universe model reconciles the data with the laws of nature we all believe in.

Not really sure what point you are making - the signal people have devoted decades to measuring exists and has structure. Otherwise the more you observed the background the more uniform it would become. Averaged more and more the observations converge to a more and more accurate image. This is completely analogous with photons striking a CCD in a digital camera during the shutter time of taking a picture of some clouds!
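The convergence under averaging can be illustrated with a toy simulation. This Python sketch assumes Gaussian noise of unit standard deviation on a single pixel of constant brightness; the function name, the number of trials and the seed are arbitrary choices for illustration, not anything measured.

```python
import random
import statistics

def spread_of_average(n_frames, trials=2000, sigma=1.0, seed=0):
    """Standard deviation of the mean of n_frames noisy readings of one pixel."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        frames = [rng.gauss(0.0, sigma) for _ in range(n_frames)]
        means.append(statistics.fmean(frames))
    return statistics.pstdev(means)

single = spread_of_average(1)      # one exposure
stacked = spread_of_average(100)   # average of 100 exposures
ratio = single / stacked           # expected to be near sqrt(100) = 10
print(single, stacked, ratio)
```

The scatter of the averaged pixel shrinks roughly with the square root of the number of frames, while any real brightness difference between pixels would survive the averaging.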

One thing that confuses me about the CIC filter is this:
In the "recursive running sum filter," the difference equation is:
y(n) = [x(n) - x(n - D)] / D + y(n - 1)
That's just a point-slope formula, wherein the slope off of y(n-1) is estimated by two prior points in the time series. That makes perfect sense.
But the CIC filter eliminates the 1/D factor, which would seem to entail a much larger estimation of the slope. I know I'm missing something here; I just don't know what it is.
-Phil

That's confusing to me, too.

The crucial text is "If we condense the delay-line representation and ignore the 1/D scaling in Figure 2b we obtain the classic form of a 1st-order CIC filter, whose cascade structure is shown in Figure 2c."

Remember the motivation for CICs is fast fixed-point decimation or interpolation, where you trade sample rate for bits, so you don't actually want to throw away those bits. For instance, 8-bit samples decimated by a factor of 256 need 16 bits to be represented without information loss.

Also the trick with 2's complement overflow during repeated integration not mattering crucially depends on no division or shift-right operations.
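Both points can be tried out in a short Python sketch. It makes several assumptions for illustration: a first-order filter with the integrator ahead of the comb, no actual rate change, and unsigned masking to 16 bits standing in for fixed-width 2's complement registers (the modular arithmetic is the same). The function name is made up.

```python
def cic_wrapping(x, D, bits):
    """First-order CIC whose integrator wraps modulo 2**bits."""
    mask = (1 << bits) - 1
    acc, history, out = 0, [], []
    for xn in x:
        acc = (acc + xn) & mask                    # integrator, free to overflow
        history.append(acc)
        delayed = history[-1 - D] if len(history) > D else 0
        out.append((acc - delayed) & mask)         # comb in the same modular arithmetic
    return out

D = 256
x = [255] * 1000               # worst case for unsigned 8-bit samples
y = cic_wrapping(x, D, bits=16)
print(y[-1], 255 * len(x))
```

The integrator would reach 255000 without wrapping, far beyond 16 bits, yet the comb output settles at 255 * 256 = 65280, which just fits. That matches the 8 + log2(256) = 16 bit figure. A division or right shift between the two stages would break the modular cancellation, which is why any scaling is left to the end.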

Not really sure what point you are making - the signal people have devoted decades to measuring exists and has structure. Otherwise the more you observed the background the more uniform it would become. Averaged more and more the observations converge to a more and more accurate image. This is completely analogous with photons striking a CCD in a digital camera during the shutter time of taking a picture of some clouds!

The analogy is not perfect. The difference: a CCD cell has a certain size and collects photons at a low light level. Step by step the charge in a cell accumulates, and eventually the noise averages out. This is noise reduction with the square root of the number of events. In the case of the background, more sensitive detectors give a better resolution of the measured temperature, and at the same time the aperture is smaller, so the angular resolution is better. That means: if two spots of different temperature can now be separated, you measure a high and a low temperature instead of a mean temperature. But you have to understand: what you measure in the background radiation is a very low level of microwave energy. The spectrum of this energy is assumed to be black-body radiation. That means you put a model into the evaluation of your signal. Black-body radiation has a maximum in energy density. The position of the maximum relates to the temperature, but because the top is flat, the location of the maximum is hard to determine. So you have to measure at a second frequency, which necessarily has a lower amplitude, to determine the ratio, and from that the temperature of an equivalent black-body radiation.
In the end, as it turns out, there are very small spatial fluctuations in the temperature of the radiation. We see a picture that is reminiscent of the structure we know from nuclear bombs or volcano eruptions, and so we infer that the structure really is the echo of a big bang AND that the structure is somehow related to the distribution of galaxies in the universe, and everything fits together nicely. That is science: have a theory and find a similarity with your observations. Scientific effort is independent of wind and rain.

We should come back to the original problem: there is a signal overlaid by noise. By definition, the signal is what we want to see, and the noise is what we want to exclude or separate. A conventional filter removes the noise, so we do not know how the noise is structured; the information about the noise is lost. A filter is nothing but an amplifier with different amplification for different frequencies. Frequencies not influenced by the filter don't show phase shift; those suppressed also do not show phase shift, as they are no longer present. But as a delta peak or spike contains every frequency, filtering out the high frequencies of the spike keeps the low frequencies, which are now indistinguishable from the signal. If we are able to determine the spectrum of the noise we removed (by subtracting the filtered signal from the original signal), we can see which frequencies of the spike we are missing, remove those frequencies from the filtered signal, and add them to the noise that was filtered out. Now the noise should look exactly the way we expect a spike to look.
But, as I mentioned before: we always need a model of what the signal is or what the noise looks like; in that case we can create an optimal noise suppression or signal recovery. The most common way to do this is not to determine the signal, but to determine the noise. We correlate a noise model (a spike of given form) with the measured signal and remove the noise model, multiplied by the correlation coefficient, from the measured signal. That is a perfect solution to the extent that the noise model is accurate, even when the signal contributes to the noise spectrum.
That said, it is a question of what your goal is: a real-time filter with as little phase shift as possible, or a post-mortem analysis that allows computing indefinitely to reach the ultimate solution.
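The correlate-and-subtract idea can be sketched as follows. This Python example rests on assumptions made only for illustration: the spike shape `template` and its position are known in advance, the "correlation coefficient" is taken to be the least-squares amplitude of the template in the measurement, and the wanted signal is deliberately chosen to be uncorrelated with the template.

```python
def remove_template(measured, template):
    """Subtract the least-squares fit of `template` from `measured`."""
    coeff = (sum(m * t for m, t in zip(measured, template))
             / sum(t * t for t in template))
    cleaned = [m - coeff * t for m, t in zip(measured, template)]
    return coeff, cleaned

template = [0, 0, 1, 2, 1, 0, 0, 0]       # assumed spike shape, position known
signal = [1, -1, 1, -1, 1, -1, 1, -1]     # wanted signal (made-up values)
measured = [s + 4 * t for s, t in zip(signal, template)]   # spike of amplitude 4

coeff, cleaned = remove_template(measured, template)
print(coeff)
print(cleaned)
```

Here the recovery is exact only because the chosen signal happens to be orthogonal to the spike shape. When the signal does correlate with the template, part of it ends up in the fitted amplitude and is removed along with the spike. If the spike position is unknown, the template would have to be slid along the measurement first, which costs the delay discussed further down.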

If the noise is truly random in nature then if you subtract it from your signal one time and then add it to your sample the next it has a tendency to cancel itself out or stay low to the noise floor. In order to establish any noise in the first place you must take the difference between two samples. Those samples can be adjacent samples or some N samples down the road. Regardless if you add the noise one time, you must subtract the second time. No matter how you slice it you will have 'some' amount of delay required to capture the difference in signals.

I ran another algorithm to play with the numbers again but I don't see any way to do this without some delay unless you can predict the future.

Yes, you are absolutely right, Beau. No future without the past. Any harmonic, by defining a period, is expanded to all future and was forever. That's how Fourier transform works, and that limits all theories to never fit to reality. If a signal is made great again, it must have been small after being great and will be small after being great. It just matters, if the oscillation is dampened or not.

## Comments

https://en.wikipedia.org/wiki/Nonlinear_filter

In signal processing, a nonlinear (or non-linear) filter is a filter whose output is not a linear function of its input. ...

An important example of the latter is the running-median filter, such that every output sample R_i is the median of the last three input samples r_i, r_{i−1}, r_{i−2}.

Signals are routinely extracted from arbitrary noise; signals have structure, noise does not.

Or do you mean arbitrary interference?

The crucial text is "If we condense the delay-line representation and ignore the 1/D scaling in Figure 2b we obtain the classic form of a 1st-order CIC filter, whose cascade structure is shown in Figure 2c."

Remember the motivation for CICs is fast fixed-point decimation or interpolation, where you trade sample rate for bits, so you don't actually want to throw away those bits. For instance, 8-bit samples decimated by a factor of 256 need 16 bits to be represented without information loss.

Also the trick with 2's complement overflow during repeated integration not mattering crucially depends on no division or shift-right operations.

You can always scale at the end if you need to.

I ran another algorithm to play with the numbers again but I don't see any way to do this without some delay unless you can predict the future.

Link to XLS file: https://drive.google.com/open?id=1xRils1TLEscE9DuujwuAQNOr745OKTN0